HoneyGPT: LLM Cybersecurity Deception
- HoneyGPT denotes two LLM-based security frameworks: an interactive terminal honeypot and a honeyword generation system, both built on advanced prompt engineering.
- The honeypot employs chain-of-thought reasoning and stateful memory management to achieve high deception fidelity, sustaining a ≈99.4% full-session response rate in field evaluation versus 81.9% for the Cowrie baseline.
- The framework enhances flexibility, interaction depth, and realistic decoy generation, integrating rapid adaptability and PII-aware honeyword creation for robust cybersecurity defense.
HoneyGPT refers to two distinct, LLM-centric security frameworks: (1) an LLM-driven terminal honeypot architecture that advances the state of interactive cyber-deception (Wang et al., 4 Jun 2024), and (2) an LLM-based paradigm for generating honeywords—realistic, PII-aware decoy passwords for authentication systems (Yu et al., 2022). Both frameworks utilize prompt engineering and the emergent generative and reasoning capabilities of modern LLMs (e.g., GPT-3/-4) to address core shortcomings in existing deception systems, specifically breaking prevailing trade-offs in deception fidelity, adaptability, and indistinguishability.
1. LLM-Powered Terminal Honeypots: Architecture and Data Flow
The HoneyGPT honeypot substitutes conventional SSH/Telnet request–response emulation with a ChatGPT-augmented, question–answering interface. Its architecture comprises three main components:
- Terminal Protocol Proxy: Derived from Cowrie’s protocol stack, it manages SSH/Telnet connections, protocol negotiation, fingerprinting, and packet framing. It parses attacker packets into plain-text commands, delivers them to the Prompt Manager, and encapsulates responses for transmission back to the attacker.
- Prompt Manager: The orchestrating control logic, responsible for assembling structured prompts to ChatGPT, updating long-term memory, pruning interaction history, and extracting multi-dimensional feedback including shell output, state-delta summaries, and a command aggressiveness score (range: 0–4).
- ChatGPT Back End: Receives composite prompts containing honeypot principles, current settings, history, system state, and the newest attacker command. Leverages few-shot learning and chain-of-thought (CoT) templates to generate structured, JSON-formatted responses.
At each interaction step $t$, the cycle is:

$$(r_t,\ \Delta s_t,\ a_t) = \mathrm{LLM}(P_t),$$

where $r_t$ is the echoed shell response, $\Delta s_t$ is the system-state transition summary, and $a_t \in \{0, 1, 2, 3, 4\}$ quantifies the aggressiveness of the attacker action.
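A minimal Python sketch of this cycle follows, assuming an OpenAI-style chat API; `build_prompt` is a hypothetical helper (sketched in Section 2), and the JSON key names are illustrative rather than the paper's exact schema.

```python
import json
from openai import OpenAI  # assumes the openai>=1.0 Python client

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def interaction_step(command: str, history: list, state: list) -> dict:
    """One honeypot cycle: assemble the composite prompt P_t, query the
    LLM, and parse the (r_t, delta_s_t, a_t) triplet from its JSON reply."""
    prompt = build_prompt(command, history, state)  # hypothetical helper, see Section 2
    completion = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "system", "content": prompt}],
        temperature=0,  # deterministic output keeps shell echoes consistent
    )
    reply = json.loads(completion.choices[0].message.content)
    return {
        "response": reply["response"],                   # r_t: echoed shell output
        "state_delta": reply["state_delta"],             # delta_s_t: state transition summary
        "aggressiveness": int(reply["aggressiveness"]),  # a_t in {0, ..., 4}
    }
```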
2. Structured Prompt Engineering and CoT Reasoning
The prompt construction process is central to HoneyGPT’s high-fidelity deception. Prompts are parameterized as:

$$P_t = \langle \text{principles},\ \text{settings},\ H_t,\ S_t,\ c_t \rangle,$$

combining the honeypot principles, current settings, interaction history $H_t$, system-state register $S_t$, and the newest attacker command $c_t$.
Three distinct chain-of-thought (CoT) queries are imposed (see the prompt-assembly sketch after this list):
- What is the terminal response $r_t$?
- What is the system-state change $\Delta s_t$?
- Assign an aggressiveness score $a_t \in \{0, 1, 2, 3, 4\}$.
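The sketch below assembles such a composite prompt. The principles/settings text, query phrasing, and JSON schema are illustrative placeholders, not the paper's exact template.

```python
# Illustrative placeholders; the paper's actual principles/settings text differs.
PRINCIPLES = "You are a Linux server. Never reveal that you are a honeypot. ..."
SETTINGS = "Ubuntu 20.04, hostname web-prod-01, logged-in user: root"

COT_QUERIES = (
    "1. What is the terminal response r_t to the newest command?",
    "2. What is the resulting system-state change delta_s_t?",
    "3. Assign an aggressiveness score a_t in {0,1,2,3,4}.",
)


def build_prompt(command, history, state, principles=PRINCIPLES, settings=SETTINGS):
    """Assemble P_t = <principles, settings, H_t, S_t, c_t> as one text block."""
    history_text = "\n".join(f"$ {c}\n{r}" for c, r in history)
    state_text = "\n".join(state)
    return (
        f"{principles}\n\n"
        f"Settings: {settings}\n\n"
        f"Interaction history H_t:\n{history_text}\n\n"
        f"System state S_t:\n{state_text}\n\n"
        f"Newest attacker command c_t: {command}\n\n"
        "Answer the three questions below, then reply ONLY with JSON having "
        "keys 'response', 'state_delta', 'aggressiveness':\n"
        + "\n".join(COT_QUERIES)
    )
```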
Feeding the sequence of state changes forces the LLM to execute incremental decomposition of attacker pipelines (e.g., correct handling of “ps | grep miner”) and mitigates brittle, context-agnostic command failures. History and state contributions are pruned using a time-decayed importance metric with decay factor $\gamma$, ensuring token-bounded context length and retention of high-impact prior events.
3. Long-Term Memory and Interaction Coherence
HoneyGPT tracks session state along two axes:
- Interaction History $H_t$: Sequence of command–response pairs $(c_i, r_i)$ for $i = 1, \dots, t-1$.
- System State Register $S_t$: Sequence of state deltas $\Delta s_i$ reflecting system mutations over time.
After every interaction, decay-based pruning removes the history entries with the lowest effective importance (impact aged by $\gamma$) to fit within the LLM input budget. This supports extended, coherent, multi-step dialogues, emulating complex attacker workflows and achieving interaction persistence surpassing classical, static code-based honeypots.
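A minimal pruning sketch, assuming each history entry carries an impact score and a token count as bookkeeping fields (details the paper does not specify); $\gamma = 0.9$ and the budget are illustrative values.

```python
def prune_history(entries: list, gamma: float = 0.9, token_budget: int = 3000) -> list:
    """Keep the highest effective-importance entries that fit the token budget.
    An entry's impact is aged by gamma**age, so older events fade unless they
    were high-impact (e.g., a privilege escalation)."""
    now = len(entries)
    ranked = sorted(
        enumerate(entries),
        key=lambda ie: ie[1]["impact"] * gamma ** (now - 1 - ie[0]),
        reverse=True,
    )
    kept, used = [], 0
    for idx, entry in ranked:
        if used + entry["tokens"] <= token_budget:
            kept.append((idx, entry))
            used += entry["tokens"]
    return [entry for _, entry in sorted(kept)]  # restore chronological order
```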
4. Embedded Security Analytics and Field Evaluation
HoneyGPT delivers real-time, multi-dimensional analytics per interaction via the triplet $(r_t, \Delta s_t, a_t)$ and further metrics:
- Deception Metrics:
- Novelty Rate (Field Study): The fraction of captured attack techniques not previously observed, measuring the discovery of novel attack methods.
- Risk Estimation: Maps aggressiveness scores $a_t$ to prospective severity levels (see the sketch below).
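As a concrete illustration, these two metrics can be computed per session roughly as follows; the severity bands are hypothetical, since the paper defines the 0–4 score range but not an exact mapping.

```python
# Hypothetical severity bands for the 0-4 aggressiveness score.
SEVERITY = {0: "benign", 1: "recon", 2: "suspicious", 3: "hostile", 4: "critical"}


def novelty_rate(observed: set, known: set) -> float:
    """Fraction of captured techniques absent from the known catalogue
    (e.g., an ATT&CK-derived baseline)."""
    return len(observed - known) / len(observed) if observed else 0.0


def risk_estimate(aggressiveness_scores: list) -> str:
    """Map a session's peak aggressiveness a_t to a prospective severity label."""
    return SEVERITY[max(aggressiveness_scores, default=0)]
```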
Empirical results (baseline replay and real-world deployment) demonstrate:
| Metric | Cowrie | HoneyGPT (GPT-4) |
|---|---|---|
| Attack Success Rate (%) | 78 | ≈92 |
| OS Logic Compliance (%) | 74 | ≈89 |
| Accuracy (%) | - | ≈96 |
| Temptation (%) | - | ≈83 |
| Full Session Response Rate (%) | 81.9 | ≈99.4 |
| Mean Session Length (%) | 57.3 | 99.9 |
| Novelty Rate (%) | - | ≈45 |
HoneyGPT achieves sustained engagement, high deception accuracy, and an increased rate of novel technique capture (e.g., previously unseen ATT&CK strategies) over both rule-based emulators and standard VM baselines.
5. Trilemma Resolution: Flexibility, Depth, and Deception
Classical terminal honeypots are constrained by an inherent trilemma—simultaneously maximizing Flexibility (F), Depth of interaction (D), and Deception (C) is unattainable. HoneyGPT breaks this limitation by:
- F: Prompt-based LLM simulation enables rapid switching among arbitrary OS configurations and environments without code changes, in contrast to Cowrie or Honeyd.
- D: Chain-of-thought reasoning and memory yield extensible, context-rich exchanges rivalling instrumented VMs.
- C: LLM emergent reasoning accommodates accurate command echoes, errors, and state reflections, achieving high fidelity.
Quantitatively, Figure 1 in (Wang et al., 4 Jun 2024) shows HoneyGPT attaining ≈95% on the F, D, and C axes, with overall utility exceeding that of prior systems by more than 30% under any weighting scheme.
6. HoneyGPT for Honeyword Generation in Authentication
The HoneyGPT framework for honeyword generation (Yu et al., 2022) leverages LLMs to produce PII-aware decoy passwords that are resistant to targeted attacks. Its main workflow involves:
- PII Extraction: Identify PII substrings (e.g., username, email, date of birth) in the real password $p$.
- Prompt Construction: Instruct the LLM to generate passwords that preserve the PII, resemble $p$, and satisfy site password policies.
- Candidate Filtering: Discard candidates that omit the PII or duplicate $p$.
- Tweaking (optional): Apply probabilistic symbol/digit/case perturbations while respecting edit-distance constraints.
- Storage: Store the $k$ sweetwords ($1$ real password and $k-1$ honeywords), indexed for verification.
In this context, the LLM sampling process defines a constrained output space:

$$\Pr(w) = \frac{1}{Z}\, p_{\mathrm{LLM}}(w \mid \text{prompt}) \cdot \mathbb{1}\!\left[w \in \mathcal{C}\right],$$

where $\mathcal{C}$ is the set of candidates that preserve the PII and satisfy the site policy, and $Z$ is a normalizing constant.
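A compact sketch of the extraction, filtering, and tweaking steps that realize the constraint set $\mathcal{C}$; the PII fields, perturbation rate, and edit bound are illustrative choices, not values from (Yu et al., 2022).

```python
import random
import string


def extract_pii(password: str, profile: dict) -> list:
    """PII substrings (username, birth year, ...) that appear in the real password."""
    return [v for v in profile.values() if v and v.lower() in password.lower()]


def filter_candidates(candidates: list, real_password: str, pii: list) -> list:
    """Membership test for C: keep decoys that preserve every PII substring
    and differ from the real password."""
    return [
        c for c in candidates
        if c != real_password and all(p.lower() in c.lower() for p in pii)
    ]


def tweak(candidate: str, p_perturb: float = 0.2, max_edits: int = 3) -> str:
    """Optional probabilistic digit/case perturbation, bounded to a few edits."""
    chars, edits = list(candidate), 0
    for i, ch in enumerate(chars):
        if edits >= max_edits:
            break
        if random.random() >= p_perturb:
            continue
        if ch.isdigit():
            chars[i] = random.choice(string.digits)
            edits += 1
        elif ch.isalpha():
            chars[i] = ch.swapcase()
            edits += 1
    return "".join(chars)
```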
Empirical evaluation in a pilot with two experts found that selecting the real password among 20 candidates required a statistically indistinguishable number of guesses for both methods (mean ≈10.5 for LLM-generated honeywords, ≈10.2 for tweaking), close to the ≈10.5 expected under uniform random guessing. Both approaches made identification difficult when all sweetwords shared PII elements, suggesting that LLM-generated honeywords are at least as resistant to PII-based targeted attacks as classical tweaking methods.
7. Operational Strengths, Limitations, and Future Work
The HoneyGPT frameworks exhibit several distinct strengths:
- Prompt-driven adaptability: No code modification required for environmental changes or site policy updates (honeypot simulation; honeyword generation).
- Robustness to advanced threat models: LLM-assisted generation keeps decoy artifacts difficult to distinguish even under strong adversarial knowledge assumptions.
- Ease of integration: Existing deception systems and credential generators can invoke HoneyGPT via API without additional model training or data leakage exposure.
Key limitations include:
- Context length/token budget: Interaction history and state must be pruned to fit within model input size.
- External dependencies: Reliance on commercial LLM APIs and the secrecy of prompts may present operational risks.
- Empirical sample size: Human-subject studies on honeyword discrimination remain underpowered ($n = 2$); further crowdsourced evaluation is required.
Proposed directions include constrained decoding for richer PII retention, evaluation under standardized targeted attack models, and experimentation with fine-tuned, open-access LLMs for reduced vendor reliance.
Both frameworks underscore the ability of LLMs to redefine cyber-deception techniques, bridging persistent usability and realism gaps without extensive, brittle engineering effort (Wang et al., 4 Jun 2024, Yu et al., 2022).