HoneyGPT: LLM Cybersecurity Deception
- HoneyGPT denotes two LLM-based security frameworks: an interactive terminal honeypot and a honeyword generation system, both built on advanced prompt engineering.
- The honeypot employs chain-of-thought reasoning and stateful memory management to achieve high deception fidelity, sustaining a ≈99.4% full-session response rate in field evaluation versus 81.9% for the Cowrie baseline.
- The framework enhances flexibility, interaction depth, and realistic decoy generation, integrating rapid adaptability and PII-aware honeyword creation for robust cybersecurity defense.
HoneyGPT refers to two distinct, LLM-centric security frameworks: (1) an LLM-driven terminal honeypot architecture that advances the state of interactive cyber-deception (Wang et al., 4 Jun 2024), and (2) an LLM-based paradigm for generating honeywords—realistic, PII-aware decoy passwords for authentication systems (Yu et al., 2022). Both frameworks utilize prompt engineering and the emergent generative and reasoning capabilities of modern LLMs (e.g., GPT-3/-4) to address core shortcomings in existing deception systems, specifically breaking prevailing trade-offs in deception fidelity, adaptability, and indistinguishability.
1. LLM-Powered Terminal Honeypots: Architecture and Data Flow
The HoneyGPT honeypot substitutes conventional SSH/Telnet request–response emulation with a ChatGPT-augmented, question–answering interface. Its architecture comprises three main components:
- Terminal Protocol Proxy: Derived from Cowrie’s protocol stack, it manages SSH/Telnet connections, protocol negotiation, fingerprinting, and packet framing. It parses attacker packets into plain-text commands, delivers them to the Prompt Manager, and encapsulates responses for transmission back to the attacker.
- Prompt Manager: The orchestrating control logic, responsible for assembling structured prompts to ChatGPT, updating long-term memory, pruning interaction history, and extracting multi-dimensional feedback including shell output, state-delta summaries, and a command aggressiveness score (range: 0–4).
- ChatGPT Back End: Receives composite prompts containing honeypot principles, current settings, history, system state, and the newest attacker command. Leverages few-shot learning and chain-of-thought (CoT) templates to generate structured, JSON-formatted responses.
At each interaction step $t$, the cycle is:

$$(r_t,\ \Delta s_t,\ a_t) = \mathrm{LLM}(P_t),$$

where $r_t$ is the echoed shell response, $\Delta s_t$ is the system-state transition summary, and $a_t \in \{0, 1, 2, 3, 4\}$ quantifies the aggressiveness of the attacker action.
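A minimal Python sketch of this cycle follows, assuming an OpenAI-style chat API; `build_prompt` is a hypothetical helper (sketched in Section 2), and the JSON key names are illustrative rather than the paper's exact schema.

```python
import json
from openai import OpenAI  # assumes the openai>=1.0 Python client

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def interaction_step(command: str, history: list, state: list) -> dict:
    """One honeypot cycle: assemble the composite prompt P_t, query the
    LLM, and parse the (r_t, delta_s_t, a_t) triplet from its JSON reply."""
    prompt = build_prompt(command, history, state)  # hypothetical helper, see Section 2
    completion = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "system", "content": prompt}],
        temperature=0,  # deterministic output keeps shell echoes consistent
    )
    reply = json.loads(completion.choices[0].message.content)
    return {
        "response": reply["response"],                   # r_t: echoed shell output
        "state_delta": reply["state_delta"],             # delta_s_t: state transition summary
        "aggressiveness": int(reply["aggressiveness"]),  # a_t in {0, ..., 4}
    }
```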
2. Structured Prompt Engineering and CoT Reasoning
The prompt construction process is central to HoneyGPT’s high-fidelity deception. Prompts are parameterized as:

$$P_t = \langle \text{principles},\ \text{settings},\ H_t,\ S_t,\ c_t \rangle,$$

combining the honeypot principles, current settings, interaction history $H_t$, system-state register $S_t$, and the newest attacker command $c_t$.
Three distinct chain-of-thought (CoT) queries are imposed (see the prompt-assembly sketch after this list):
- What is the terminal response $r_t$?
- What is the system-state change $\Delta s_t$?
- Assign an aggressiveness score $a_t \in \{0, 1, 2, 3, 4\}$.
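The sketch below assembles such a composite prompt. The principles/settings text, query phrasing, and JSON schema are illustrative placeholders, not the paper's exact template.

```python
# Illustrative placeholders; the paper's actual principles/settings text differs.
PRINCIPLES = "You are a Linux server. Never reveal that you are a honeypot. ..."
SETTINGS = "Ubuntu 20.04, hostname web-prod-01, logged-in user: root"

COT_QUERIES = (
    "1. What is the terminal response r_t to the newest command?",
    "2. What is the resulting system-state change delta_s_t?",
    "3. Assign an aggressiveness score a_t in {0,1,2,3,4}.",
)


def build_prompt(command, history, state, principles=PRINCIPLES, settings=SETTINGS):
    """Assemble P_t = <principles, settings, H_t, S_t, c_t> as one text block."""
    history_text = "\n".join(f"$ {c}\n{r}" for c, r in history)
    state_text = "\n".join(state)
    return (
        f"{principles}\n\n"
        f"Settings: {settings}\n\n"
        f"Interaction history H_t:\n{history_text}\n\n"
        f"System state S_t:\n{state_text}\n\n"
        f"Newest attacker command c_t: {command}\n\n"
        "Answer the three questions below, then reply ONLY with JSON having "
        "keys 'response', 'state_delta', 'aggressiveness':\n"
        + "\n".join(COT_QUERIES)
    )
```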
Feeding the sequence of state changes forces the LLM to execute incremental decomposition of attacker pipelines (e.g., correct handling of “ps | grep miner”) and mitigates brittle, context-agnostic command failures. History and state contributions are pruned using a time-decayed importance metric with decay factor $\gamma$, ensuring token-bounded context length and retention of high-impact prior events.
3. Long-Term Memory and Interaction Coherence
HoneyGPT tracks session state along two axes:
- Interaction History $H_t$: Sequence of command–response pairs $(c_i, r_i)$ for $i = 1, \dots, t-1$.
- System State Register $S_t$: Sequence of state deltas $\Delta s_i$ reflecting system mutations over time.
After every interaction, decay-based pruning removes the history entries with the lowest effective importance (impact aged by $\gamma$) to fit within the LLM input budget. This supports extended, coherent, multi-step dialogues, emulating complex attacker workflows and achieving interaction persistence surpassing classical, static code-based honeypots.
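A minimal pruning sketch, assuming each history entry carries an impact score and a token count as bookkeeping fields (details the paper does not specify); $\gamma = 0.9$ and the budget are illustrative values.

```python
def prune_history(entries: list, gamma: float = 0.9, token_budget: int = 3000) -> list:
    """Keep the highest effective-importance entries that fit the token budget.
    An entry's impact is aged by gamma**age, so older events fade unless they
    were high-impact (e.g., a privilege escalation)."""
    now = len(entries)
    ranked = sorted(
        enumerate(entries),
        key=lambda ie: ie[1]["impact"] * gamma ** (now - 1 - ie[0]),
        reverse=True,
    )
    kept, used = [], 0
    for idx, entry in ranked:
        if used + entry["tokens"] <= token_budget:
            kept.append((idx, entry))
            used += entry["tokens"]
    return [entry for _, entry in sorted(kept)]  # restore chronological order
```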
4. Embedded Security Analytics and Field Evaluation
HoneyGPT delivers real-time, multi-dimensional analytics per interaction via the triplet $(r_t, \Delta s_t, a_t)$ and further metrics:
- Deception Metrics:
- Novelty Rate (Field Study): The fraction of captured attack techniques not previously observed, measuring the discovery of novel attack methods.
- Risk Estimation: Maps aggressiveness scores $a_t$ to prospective severity levels (see the sketch below).
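As a concrete illustration, these two metrics can be computed per session roughly as follows; the severity bands are hypothetical, since the paper defines the 0–4 score range but not an exact mapping.

```python
# Hypothetical severity bands for the 0-4 aggressiveness score.
SEVERITY = {0: "benign", 1: "recon", 2: "suspicious", 3: "hostile", 4: "critical"}


def novelty_rate(observed: set, known: set) -> float:
    """Fraction of captured techniques absent from the known catalogue
    (e.g., an ATT&CK-derived baseline)."""
    return len(observed - known) / len(observed) if observed else 0.0


def risk_estimate(aggressiveness_scores: list) -> str:
    """Map a session's peak aggressiveness a_t to a prospective severity label."""
    return SEVERITY[max(aggressiveness_scores, default=0)]
```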
Empirical results (baseline replay and real-world deployment) demonstrate:
| Metric | Cowrie | HoneyGPT (GPT-4) |
|---|---|---|
| Attack Success Rate (%) | 78 | ≈92 |
| OS Logic Compliance (%) | 74 | ≈89 |
| Accuracy (%) | - | ≈96 |
| Temptation (%) | - | ≈83 |
| Full Session Response Rate (%) | 81.9 | ≈99.4 |
| Mean Session Length (%) | 57.3 | 99.9 |
| Novelty Rate (%) | - | ≈45 |
HoneyGPT achieves sustained engagement, high deception accuracy, and an increased rate of novel technique capture (e.g., previously unseen ATT&CK strategies) over both rule-based emulators and standard VM baselines.
5. Trilemma Resolution: Flexibility, Depth, and Deception
Classical terminal honeypots are constrained by an inherent trilemma—simultaneously maximizing Flexibility (F), Depth of interaction (D), and Deception (C) is unattainable. HoneyGPT breaks this limitation by:
- F: Prompt-based LLM simulation enables rapid switching among arbitrary OS configurations and environments without code changes, in contrast to Cowrie or Honeyd.
- D: Chain-of-thought reasoning and memory yield extensible, context-rich exchanges rivalling instrumented VMs.
- C: LLM emergent reasoning accommodates accurate command echoes, errors, and state reflections, achieving high fidelity.
Quantitatively, Figure 1 in (Wang et al., 4 Jun 2024) shows HoneyGPT attaining ≈95% on the F, D, and C axes, with overall utility exceeding that of prior systems by more than 30% under any weighting scheme.
6. HoneyGPT for Honeyword Generation in Authentication
The HoneyGPT framework for honeyword generation (Yu et al., 2022) leverages LLMs to produce PII-aware decoy passwords that are resistant to targeted attacks. Its main workflow involves:
- PII Extraction: Identify PII substrings (e.g., username, email, date of birth) in the real password $p$.
- Prompt Construction: Instruct the LLM to generate passwords that preserve the PII, resemble $p$, and satisfy site password policies.
- Candidate Filtering: Discard candidates that omit the PII or duplicate $p$.
- Tweaking (optional): Apply probabilistic symbol/digit/case perturbations while respecting edit-distance constraints.
- Storage: Store the $k$ sweetwords ($1$ real password and $k-1$ honeywords), indexed for verification.
In this context, the LLM sampling process defines a constrained output space:

$$\Pr(w) = \frac{1}{Z}\, p_{\mathrm{LLM}}(w \mid \text{prompt}) \cdot \mathbb{1}\!\left[w \in \mathcal{C}\right],$$

where $\mathcal{C}$ is the set of candidates that preserve the PII and satisfy the site policy, and $Z$ is a normalizing constant.
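A compact sketch of the extraction, filtering, and tweaking steps that realize the constraint set $\mathcal{C}$; the PII fields, perturbation rate, and edit bound are illustrative choices, not values from (Yu et al., 2022).

```python
import random
import string


def extract_pii(password: str, profile: dict) -> list:
    """PII substrings (username, birth year, ...) that appear in the real password."""
    return [v for v in profile.values() if v and v.lower() in password.lower()]


def filter_candidates(candidates: list, real_password: str, pii: list) -> list:
    """Membership test for C: keep decoys that preserve every PII substring
    and differ from the real password."""
    return [
        c for c in candidates
        if c != real_password and all(p.lower() in c.lower() for p in pii)
    ]


def tweak(candidate: str, p_perturb: float = 0.2, max_edits: int = 3) -> str:
    """Optional probabilistic digit/case perturbation, bounded to a few edits."""
    chars, edits = list(candidate), 0
    for i, ch in enumerate(chars):
        if edits >= max_edits:
            break
        if random.random() >= p_perturb:
            continue
        if ch.isdigit():
            chars[i] = random.choice(string.digits)
            edits += 1
        elif ch.isalpha():
            chars[i] = ch.swapcase()
            edits += 1
    return "".join(chars)
```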
Empirical evaluation in a pilot with two experts found that selecting the real password among 20 candidates required a statistically indistinguishable number of guesses for both methods (mean ≈10.5 for LLM-generated honeywords, ≈10.2 for tweaking), close to the ≈10.5 expected under uniform random guessing. Both approaches made identification difficult when all sweetwords shared PII elements, suggesting that LLM-generated honeywords are at least as resistant to PII-based targeted attacks as classical tweaking methods.
7. Operational Strengths, Limitations, and Future Work
The HoneyGPT frameworks exhibit several distinct strengths:
- Prompt-driven adaptability: No code modification required for environmental changes or site policy updates (honeypot simulation; honeyword generation).
- Robustness to advanced threat models: LLM-assisted generation keeps decoy artifacts difficult to distinguish even under strong adversarial knowledge assumptions.
- Ease of integration: Existing deception systems and credential generators can invoke HoneyGPT via API without additional model training or data leakage exposure.
Key limitations include:
- Context length/token budget: Interaction history and state must be pruned to fit within model input size.
- External dependencies: Reliance on commercial LLM APIs and the secrecy of prompts may present operational risks.
- Empirical sample size: Human-subject studies on honeyword discrimination remain underpowered ($n = 2$); further crowdsourced evaluation is required.
Proposed directions include constrained decoding for richer PII retention, evaluation under standardized targeted attack models, and experimentation with fine-tuned, open-access LLMs for reduced vendor reliance.
Both frameworks underscore the ability of LLMs to redefine cyber-deception techniques, bridging persistent usability and realism gaps without extensive, brittle engineering effort (Wang et al., 4 Jun 2024, Yu et al., 2022).