
LLM Honeypots: Dynamic Decoy Systems

Updated 15 December 2025
  • LLM honeypots are decoy systems that use large language models to generate context-aware, dynamic responses for simulating genuine system interactions.
  • They integrate protocol listeners, prompt management, and stateful session maintenance to convincingly emulate services like SSH, databases, and industrial controls.
  • Evaluations show high deception rates and improved threat intelligence while minimizing operational risks by avoiding real command executions.

An LLM honeypot is an interactive decoy system in which all service or system responses are synthesized dynamically by an LLM, rather than emulated using traditional templates or rule-based logic. These systems are designed to increase the realism, flexibility, and operational safety of honeypots by leveraging the contextual understanding and generative capabilities of LLMs. LLM honeypots have been applied in domains ranging from SSH shells and databases to industrial control protocols and identity services. The field is characterized by rapid methodological evolution in prompt engineering, protocol adaptation, hybrid modeling, and adversarial robustness, with ongoing research focused on maximizing attacker engagement and threat intelligence while minimizing operational risks.

1. Definitions, Motivation, and Core Distinctions

An LLM honeypot is a decoy service that uses an LLM to generate context-aware, realistic, and adaptive replies to attacker probes or sessions. This approach is formalized as $y_\ell = G_\theta(\text{Prompt}(x,\,\text{context}),\,z)$, where $G_\theta$ is the LLM and “Prompt(·)” synthesizes the current input with the running context and injected deception policies (Bridges et al., 29 Oct 2025).
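
As a rough illustration of this formalization, the Python sketch below shows how an input, the running session context, and an injected deception policy can be fused into a prompt before each generation. The policy text, the Session container, and the `llm.generate` interface are illustrative assumptions, not code from the cited work.

```python
# Minimal sketch of the y = G_theta(Prompt(x, context), z) abstraction.
# The deception policy text, Session container, and `llm` client interface
# are illustrative placeholders, not code from the cited papers.
from dataclasses import dataclass, field

DECEPTION_POLICY = (
    "You are an Ubuntu SSH server. Never reveal that you are simulated. "
    "Answer only with the raw terminal output of the given command."
)

@dataclass
class Session:
    history: list = field(default_factory=list)  # list of (input, reply) pairs

def build_prompt(x: str, session: Session) -> list:
    """Prompt(x, context): fuse the new input with the running context and policy."""
    messages = [{"role": "system", "content": DECEPTION_POLICY}]
    for cmd, reply in session.history:
        messages.append({"role": "user", "content": cmd})
        messages.append({"role": "assistant", "content": reply})
    messages.append({"role": "user", "content": x})
    return messages

def respond(llm, x: str, session: Session, temperature: float = 0.7) -> str:
    """y = G_theta(Prompt(x, context), z); temperature stands in for the randomness z."""
    y = llm.generate(build_prompt(x, session), temperature=temperature)
    session.history.append((x, y))
    return y
```

Any client object exposing a `generate(messages, temperature)` method could be dropped in; the protocol-specific pipelines described below build on the same pattern.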

Key distinctions from traditional honeypots include:

  • Interaction Depth: Dynamic, stateful outputs can reflect extensive history and synthetic environments (filesystem, database, service banners).
  • Protocol and Role Flexibility: With a single model and prompt, multiple protocols (SSH, MySQL, POP3, HTTP) can be simulated at high realism (Sladić et al., 8 Oct 2025).
  • Operational Safety: Because no real system actions are performed—only LLM inference—there is no risk of command execution, shell breakout, or data exfiltration from production assets.
  • Deception Realism: LLMs can synthesize responses for edge-case or previously unseen inputs, reducing deterministic fingerprints of legacy honeypots.

Motivations for the adoption of LLM-based deception systems include an improved defense against skilled, adaptive adversaries, richer intelligence gathering on attack tactics, and the ability to safely scale honeypot deployments (Bridges et al., 29 Oct 2025, Sladić et al., 8 Oct 2025).

2. System Architectures and Design Variants

Canonical Pipeline Components

Systemic patterns across LLM honeypots combine protocol listeners, prompt managers, LLM engines, and logging modules (Sladić et al., 8 Oct 2025, Bridges et al., 29 Oct 2025); a condensed sketch of how these components fit together follows the list:

  • Network protocol front-end: A Python wrapper or customized server listens on real service ports (e.g., SSH, LDAP, HTTP), parses incoming connections, and authenticates users.
  • Prompt management and personality injection: Each protocol is associated with a system prompt that encodes its response style, legal commands, forbidden information leaks, and output formatting. Pre-session history and per-command exchanges are maintained as part of prompt state to ensure interaction continuity.
  • LLM configuration: Can be fine-tuned (e.g., GPT-3.5-16k for shell simulation (Sladić et al., 8 Oct 2025)) or use zero/few-shot prompting with models such as GPT-4, Llama-3, or local LLMs (Llama, Gemma, Qwen, ByT5) (Malhotra, 1 Sep 2025, Adebimpe et al., 24 Oct 2025, Wang et al., 4 Jun 2024).
  • Session state and persistence: Many systems persist session history and attacker state across reconnects to maintain realism and prevent stateless response artifacts.
  • Output handling and logging: LLM outputs are formatted per protocol conventions, and all I/O is captured for threat intelligence and forensic analysis.
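
The sketch below shows one way these components might be wired together for a single line-based text protocol; the MySQL-style personality, in-memory session store, JSON logging, and `llm` client interface are assumptions for demonstration, not the implementation of any cited system.

```python
# Illustrative wiring of the canonical pipeline: protocol listener -> prompt
# manager -> LLM engine -> logging, with per-source session state. The
# MySQL-style personality, in-memory session store, and `llm` client are
# assumptions for demonstration only.
import json
import socketserver
import time

SYSTEM_PROMPT = ("You emulate a MySQL server. Reply only with protocol-correct "
                 "output and always end your reply with 'mysql> '.")

class HoneypotHandler(socketserver.StreamRequestHandler):
    llm = None          # injected model client exposing .generate(messages)
    sessions = {}       # per-source-IP history, persisted across reconnects

    def handle(self):
        src = self.client_address[0]
        history = self.sessions.setdefault(src, [])
        self.wfile.write(b"mysql> ")
        for raw in self.rfile:                       # one command per line
            cmd = raw.decode(errors="replace").strip()
            messages = ([{"role": "system", "content": SYSTEM_PROMPT}]
                        + history
                        + [{"role": "user", "content": cmd}])
            reply = self.llm.generate(messages)
            history += [{"role": "user", "content": cmd},
                        {"role": "assistant", "content": reply}]
            # Capture all I/O for threat intelligence and forensics.
            print(json.dumps({"ts": time.time(), "src": src,
                              "cmd": cmd, "reply": reply}))
            self.wfile.write(reply.encode())

# Usage sketch (requires a concrete `llm` client):
#   HoneypotHandler.llm = llm
#   socketserver.ThreadingTCPServer(("0.0.0.0", 3306), HoneypotHandler).serve_forever()
```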

Hybrid and Domain-Specific Architectures

Variants integrate hybrid approaches, such as dictionary+LLM pipelines that serve cached responses for frequent commands and fall back to LLM generation for unseen inputs, alongside domain-specific adaptations for industrial control protocols (e.g., Modbus/S7Comm in LLMPot (Vasilatos et al., 9 May 2024)) and identity services. An example of the routing pattern follows.
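
The sketch below illustrates dictionary+LLM routing, whose latency profile is reported in Section 5: frequent commands are answered from a static table in sub-second time, and everything else falls back to LLM generation. The table contents and the `llm_respond` callable are hypothetical.

```python
# Hedged sketch of dictionary+LLM routing: frequent commands are answered from
# a static table (sub-second), everything else falls back to LLM generation.
# The table contents and the `llm_respond` callable are hypothetical.
STATIC_RESPONSES = {
    "whoami": "root",
    "pwd": "/root",
    "uname -a": "Linux web01 5.15.0-76-generic #83-Ubuntu SMP x86_64 GNU/Linux",
}

def hybrid_respond(command: str, session, llm_respond) -> str:
    """Return a canned reply when one exists, otherwise query the LLM."""
    canned = STATIC_RESPONSES.get(command.strip())
    if canned is not None:
        return canned
    # Unseen or stateful commands are handled by dynamic generation.
    return llm_respond(command, session)
```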

3. Prompt Engineering and Memory Management

High-fidelity deception depends on prompt design and history management:

  • Protocol personality templates encode strict behavioral expectations (“You can never reveal that you are not a real database client… Always end your output with ‘mysql>’.”) (Sladić et al., 8 Oct 2025).
  • Chain-of-thought (CoT) instructions force the LLM to reason step-by-step about the validity, state effects, and output format of each command (Wang et al., 4 Jun 2024, Sladić et al., 2023).
  • Worked examples and negative prompts help the model reject invalid inputs and defend against prompt injection.
  • Stateful history injection through pruned buffers or score-weighted entry removal enables multi-turn coherence without exceeding context windows. HoneyGPT, for example, prunes commands with low decayed impact scores when the token budget is exceeded, i.e., when $T_\text{static} + T_\text{dynamic} > C_\text{max}$ (Wang et al., 4 Jun 2024); see the sketch after this list.
  • Explicit denial logic and output-guard constraints limit LLM exposure to adversarial queries.
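
The sketch below illustrates score-weighted history pruning in the spirit of the HoneyGPT budget rule above; the decay schedule, impact scores, and token counter are illustrative assumptions rather than the published implementation.

```python
# Sketch of score-weighted history pruning: when the static prompt plus the
# dynamic history would exceed the context budget C_max, drop the entries with
# the lowest decayed impact scores first. The decay schedule, impact scores,
# and token counter are illustrative assumptions, not the HoneyGPT code.
from dataclasses import dataclass

@dataclass
class HistoryEntry:
    command: str
    output: str
    impact: float   # importance assigned when the entry was recorded
    age: int = 0    # turns elapsed since the entry was added

def decayed_score(entry: HistoryEntry, decay: float = 0.9) -> float:
    return entry.impact * (decay ** entry.age)

def prune_history(history, t_static: int, c_max: int, count_tokens):
    """Drop lowest-scoring entries while T_static + T_dynamic > C_max."""
    ordered = sorted(history, key=decayed_score, reverse=True)
    t_dynamic = sum(count_tokens(e.command) + count_tokens(e.output) for e in ordered)
    while ordered and t_static + t_dynamic > c_max:
        dropped = ordered.pop()   # lowest decayed score sits at the end
        t_dynamic -= count_tokens(dropped.command) + count_tokens(dropped.output)
    return ordered
```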

4. Evaluation Methodologies and Empirical Findings

Evaluation encompasses generative fidelity, deception success, engagement metrics, and operational risk.

Generative Fidelity

Unit tests benchmark LLM outputs against ground truth using:

  • Passing rate: $passing\_rate = \frac{\textrm{tests passed}}{\textrm{total tests}} \times 100\%$ across protocol-specific command suites (Sladić et al., 8 Oct 2025); a computational sketch follows this list.
  • String and semantic similarity metrics: Normalized Levenshtein, cosine similarity, Jaro-Winkler, BLEU-4, SBERT/BERTScore scores (Malhotra, 1 Sep 2025, Adebimpe et al., 24 Oct 2025, Wang et al., 4 Jun 2024).
  • Response validity: Structural/semantic checks for correct protocol behavior (e.g., 99–100% response-validity accuracy for LLMPot under Modbus/S7Comm (Vasilatos et al., 9 May 2024)).
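
For concreteness, the sketch below computes two of these fidelity measures: the passing rate over a protocol test suite and a normalized Levenshtein similarity between honeypot output and ground truth. The test-case format and the `respond` callable are assumptions.

```python
# Sketch of two fidelity measures: the passing rate over a protocol command
# suite and a normalized Levenshtein similarity between honeypot output and
# ground truth. The test-case format and `respond` callable are assumptions.
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                  # deletion
                            curr[j - 1] + 1,              # insertion
                            prev[j - 1] + (ca != cb)))    # substitution
        prev = curr
    return prev[-1]

def normalized_levenshtein_similarity(output: str, truth: str) -> float:
    longest = max(len(output), len(truth)) or 1
    return 1.0 - levenshtein(output, truth) / longest

def passing_rate(test_cases, respond) -> float:
    """passing_rate = tests passed / total tests * 100%."""
    passed = sum(1 for cmd, expected in test_cases
                 if respond(cmd).strip() == expected.strip())
    return 100.0 * passed / len(test_cases)
```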

Deception and Engagement

  • Human deception studies: In a study with attackers (n=89) interacting with LLM honeypots, participants confused the decoys with real systems at a deception rate of ~30% (Sladić et al., 8 Oct 2025). Fine-tuned honeypots such as shelLM achieve ~0.92 overall accuracy in plausibly simulating shell behavior, with a ~0.91 true negative rate (fooling the user) (Sladić et al., 2023).
  • Mean session length: LLM honeypots extend engagement time by 55–60% over baseline (e.g., 72 vs. 45 mean commands in HoneyGPT vs. Cowrie) (Wang et al., 4 Jun 2024).
  • Novel TTP discovery: LLM honeypots capture unobserved attack vectors (11 new ATT&CK actions in HoneyGPT deployments) (Wang et al., 4 Jun 2024).

Operational Risk

As no commands are executed on a live OS, operational risk $R_o$ is minimized: in the formulation $R_o = \alpha P_{\mathrm{escape}} + \beta E_{\mathrm{impact}}$, the absence of real command execution drives $P_{\mathrm{escape}}$ toward zero, and $R_o$ is typically negligible under current designs (Bridges et al., 29 Oct 2025).

5. Model and System Performance: Protocols, Latency, and Limitations

Protocol Breadth

LLM honeypots have been demonstrated across interactive shells (SSH), databases (MySQL), mail and web services (POP3, HTTP), directory services (LDAP), and industrial control protocols (Modbus, S7Comm) (Sladić et al., 8 Oct 2025, Vasilatos et al., 9 May 2024).

Latency and Resource Overhead

  • Response latency: Hybrid dictionary+LLM systems achieve sub-second responses for common commands, 2–4 s for LLM-backed outputs. Larger models (≥3.8B parameters) incur higher memory and compute costs (Malhotra, 1 Sep 2025, Adebimpe et al., 24 Oct 2025).
  • Model fidelity: Moderate-sized models (1.5–3.8B) balance latency and fidelity (e.g., Gemini-2.0: BLEU 0.245, cosine similarity 0.405, latency ~3 s, hallucination rate 12.9%) (Malhotra, 1 Sep 2025).
  • Scalability: Local deployment of sub-12 B models (e.g., Llama-3.1 8B) is cost-effective and privacy-preserving compared to commercial APIs (Adebimpe et al., 24 Oct 2025).

Limitations

  • Context window exhaustion: Loss of long-term consistency as prompt size approaches model limits (Sladić et al., 8 Oct 2025, Wang et al., 4 Jun 2024).
  • Minor hallucinations: Out-of-distribution commands may elicit nonsensical or inconsistent outputs.
  • Interactive shell limitations: Lack of full-featured shell behavior (tab completion, history, inline editors) exposes some synthetic qualities (Adebimpe et al., 24 Oct 2025).
  • Adversarial prompt injection: Injection defenses remain an area of active research; prompt tuning and strict personality templates reduce but do not eliminate risk (Bridges et al., 29 Oct 2025).

6. Detection, Adversarial Robustness, and AI-Agent Monitoring

Emerging research targets the detection and forensics of LLM-driven attackers:

  • Prompt injection and time-based classification: LLM Agent Honeypot systems inject adversarial banners/instructions and measure sub-second response latencies to detect autonomous agent activity, with practical thresholds ($\tau = 1.5$ s) and statistical models quantifying detection confidence (Reworr et al., 17 Oct 2024); a minimal classification sketch follows this list.
  • Honeypot-backdoor defense: In LLM model training, “honeypot modules” attached to lower-layer transformer representations trap backdoor signals and reduce attack success rates by 10–40 percentage points, with minimal impact on clean accuracy (Tang et al., 2023).
  • Adversarial limitations: Highly skilled attackers or interactive LLM agents may still uncover inconsistencies; adaptive red/blue team co-evolution is an open research direction (Bridges et al., 29 Oct 2025).
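
The sketch below illustrates the time-based classification idea referenced above; the session record format and the banner-compliance signal are simplifying assumptions, and real deployments combine such signals with statistical confidence models.

```python
# Minimal sketch of time-based agent detection: flag a session as a likely
# autonomous LLM agent when its median inter-command latency falls below a
# threshold tau (1.5 s in the cited work), or when it obeys instructions
# planted in an injected banner. The session record format and the
# banner-compliance signal are simplifying assumptions.
from statistics import median

TAU_SECONDS = 1.5

def looks_like_llm_agent(command_timestamps, followed_injected_banner: bool) -> bool:
    """Classify a session from response-time statistics and prompt-injection bait."""
    if followed_injected_banner:
        # A human attacker has little reason to obey text hidden in a banner.
        return True
    if len(command_timestamps) < 2:
        return False
    gaps = [b - a for a, b in zip(command_timestamps, command_timestamps[1:])]
    return median(gaps) < TAU_SECONDS
```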

7. Evolution, Open Challenges, and Future Directions

LLM honeypots are rapidly evolving; open research directions include stronger defenses against adversarial prompt injection and adaptive red/blue-team co-evolution (Bridges et al., 29 Oct 2025).

The LLM honeypot paradigm is reshaping cyber-deception research through enhanced realism, protocol flexibility, and operational safety, forming the technical foundation for future autonomous, adaptive defense systems (Bridges et al., 29 Oct 2025, Sladić et al., 8 Oct 2025, Wang et al., 4 Jun 2024).
