
LLM Honeypots: Dynamic Decoy Systems

Updated 15 December 2025
  • LLM honeypots are decoy systems that use large language models to generate context-aware, dynamic responses for simulating genuine system interactions.
  • They integrate protocol listeners, prompt management, and stateful session maintenance to convincingly emulate services like SSH, databases, and industrial controls.
  • Evaluations show high deception rates and improved threat intelligence while minimizing operational risks by avoiding real command executions.

An LLM honeypot is an interactive decoy system in which all service or system responses are synthesized dynamically by an LLM, rather than emulated using traditional templates or rule-based logic. These systems are designed to increase the realism, flexibility, and operational safety of honeypots by leveraging the contextual understanding and generative capabilities of LLMs. LLM honeypots have been applied in domains ranging from SSH shells and databases to industrial control protocols and identity services. The field is characterized by rapid methodological evolution in prompt engineering, protocol adaptation, hybrid modeling, and adversarial robustness, with ongoing research focused on maximizing attacker engagement and threat intelligence while minimizing operational risks.

1. Definitions, Motivation, and Core Distinctions

An LLM honeypot is a decoy service that uses an LLM to generate context-aware, realistic, and adaptive replies to attacker probes or sessions. This approach is formalized as $y_\ell = G_\theta(\text{Prompt}(x,\,\text{context}),\,z)$, where $G_\theta$ is the LLM and “Prompt(·)” synthesizes the current input with the running context and injected deception policies (Bridges et al., 29 Oct 2025).
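
As a rough illustration of this formalization, the Python sketch below shows how an input, the running session context, and an injected deception policy can be fused into a prompt before each generation. The policy text, the Session container, and the `llm.generate` interface are illustrative assumptions, not code from the cited work.

```python
# Minimal sketch of the y = G_theta(Prompt(x, context), z) abstraction.
# The deception policy text, Session container, and `llm` client interface
# are illustrative placeholders, not code from the cited papers.
from dataclasses import dataclass, field

DECEPTION_POLICY = (
    "You are an Ubuntu SSH server. Never reveal that you are simulated. "
    "Answer only with the raw terminal output of the given command."
)

@dataclass
class Session:
    history: list = field(default_factory=list)  # list of (input, reply) pairs

def build_prompt(x: str, session: Session) -> list:
    """Prompt(x, context): fuse the new input with the running context and policy."""
    messages = [{"role": "system", "content": DECEPTION_POLICY}]
    for cmd, reply in session.history:
        messages.append({"role": "user", "content": cmd})
        messages.append({"role": "assistant", "content": reply})
    messages.append({"role": "user", "content": x})
    return messages

def respond(llm, x: str, session: Session, temperature: float = 0.7) -> str:
    """y = G_theta(Prompt(x, context), z); temperature stands in for the randomness z."""
    y = llm.generate(build_prompt(x, session), temperature=temperature)
    session.history.append((x, y))
    return y
```

Any client object exposing a `generate(messages, temperature)` method could be dropped in; the protocol-specific pipelines described below build on the same pattern.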

Key distinctions from traditional honeypots include:

  • Interaction Depth: Dynamic, stateful outputs can reflect extensive history and synthetic environments (filesystem, database, service banners).
  • Protocol and Role Flexibility: With a single model and prompt, multiple protocols (SSH, MySQL, POP3, HTTP) can be simulated at high realism (Sladić et al., 8 Oct 2025).
  • Operational Safety: Because no real system actions are performed—only LLM inference—there is no risk of command execution, shell breakout, or data exfiltration from production assets.
  • Deception Realism: LLMs can synthesize responses for edge-case or previously unseen inputs, reducing deterministic fingerprints of legacy honeypots.

Motivations for the adoption of LLM-based deception systems include an improved defense against skilled, adaptive adversaries, richer intelligence gathering on attack tactics, and the ability to safely scale honeypot deployments (Bridges et al., 29 Oct 2025, Sladić et al., 8 Oct 2025).

2. System Architectures and Design Variants

Canonical Pipeline Components

Systemic patterns across LLM honeypots combine protocol listeners, prompt managers, LLM engines, and logging modules (Sladić et al., 8 Oct 2025, Bridges et al., 29 Oct 2025); a condensed sketch of how these components fit together follows the list:

  • Network protocol front-end: A Python wrapper or customized server listens on real service ports (e.g., SSH, LDAP, HTTP), parses incoming connections, and authenticates users.
  • Prompt management and personality injection: Each protocol is associated with a system prompt that encodes its response style, legal commands, forbidden information leaks, and output formatting. Pre-session history and per-command exchanges are maintained as part of prompt state to ensure interaction continuity.
  • LLM configuration: Can be fine-tuned (e.g., GPT-3.5-16k for shell simulation (Sladić et al., 8 Oct 2025)) or use zero/few-shot prompting with models such as GPT-4, Llama-3, or local LLMs (Llama, Gemma, Qwen, ByT5) (Malhotra, 1 Sep 2025, Adebimpe et al., 24 Oct 2025, Wang et al., 4 Jun 2024).
  • Session state and persistence: Many systems persist session history and attacker state across reconnects to maintain realism and prevent stateless response artifacts.
  • Output handling and logging: LLM outputs are formatted per protocol conventions, and all I/O is captured for threat intelligence and forensic analysis.
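
The sketch below shows one way these components might be wired together for a single line-based text protocol; the MySQL-style personality, in-memory session store, JSON logging, and `llm` client interface are assumptions for demonstration, not the implementation of any cited system.

```python
# Illustrative wiring of the canonical pipeline: protocol listener -> prompt
# manager -> LLM engine -> logging, with per-source session state. The
# MySQL-style personality, in-memory session store, and `llm` client are
# assumptions for demonstration only.
import json
import socketserver
import time

SYSTEM_PROMPT = ("You emulate a MySQL server. Reply only with protocol-correct "
                 "output and always end your reply with 'mysql> '.")

class HoneypotHandler(socketserver.StreamRequestHandler):
    llm = None          # injected model client exposing .generate(messages)
    sessions = {}       # per-source-IP history, persisted across reconnects

    def handle(self):
        src = self.client_address[0]
        history = self.sessions.setdefault(src, [])
        self.wfile.write(b"mysql> ")
        for raw in self.rfile:                       # one command per line
            cmd = raw.decode(errors="replace").strip()
            messages = ([{"role": "system", "content": SYSTEM_PROMPT}]
                        + history
                        + [{"role": "user", "content": cmd}])
            reply = self.llm.generate(messages)
            history += [{"role": "user", "content": cmd},
                        {"role": "assistant", "content": reply}]
            # Capture all I/O for threat intelligence and forensics.
            print(json.dumps({"ts": time.time(), "src": src,
                              "cmd": cmd, "reply": reply}))
            self.wfile.write(reply.encode())

# Usage sketch (requires a concrete `llm` client):
#   HoneypotHandler.llm = llm
#   socketserver.ThreadingTCPServer(("0.0.0.0", 3306), HoneypotHandler).serve_forever()
```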

Hybrid and Domain-Specific Architectures

Variants integrate hybrid approaches, such as dictionary+LLM pipelines that serve cached responses for frequent commands and fall back to LLM generation for unseen inputs, alongside domain-specific adaptations for industrial control protocols (e.g., Modbus/S7Comm in LLMPot (Vasilatos et al., 9 May 2024)) and identity services. An example of the routing pattern follows.
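
The sketch below illustrates dictionary+LLM routing, whose latency profile is reported in Section 5: frequent commands are answered from a static table in sub-second time, and everything else falls back to LLM generation. The table contents and the `llm_respond` callable are hypothetical.

```python
# Hedged sketch of dictionary+LLM routing: frequent commands are answered from
# a static table (sub-second), everything else falls back to LLM generation.
# The table contents and the `llm_respond` callable are hypothetical.
STATIC_RESPONSES = {
    "whoami": "root",
    "pwd": "/root",
    "uname -a": "Linux web01 5.15.0-76-generic #83-Ubuntu SMP x86_64 GNU/Linux",
}

def hybrid_respond(command: str, session, llm_respond) -> str:
    """Return a canned reply when one exists, otherwise query the LLM."""
    canned = STATIC_RESPONSES.get(command.strip())
    if canned is not None:
        return canned
    # Unseen or stateful commands are handled by dynamic generation.
    return llm_respond(command, session)
```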

3. Prompt Engineering and Memory Management

High-fidelity deception depends on prompt design and history management:

  • Protocol personality templates encode strict behavioral expectations (“You can never reveal that you are not a real database client… Always end your output with ‘mysql>’.”) (Sladić et al., 8 Oct 2025).
  • Chain-of-thought (CoT) instructions force the LLM to reason step-by-step about the validity, state effects, and output format of each command (Wang et al., 4 Jun 2024, Sladić et al., 2023).
  • Worked examples and negative prompts help the model reject invalid inputs and defend against prompt injection.
  • Stateful history injection through pruned buffers or score-weighted entry removal enables multi-turn coherence without exceeding context windows. HoneyGPT, for example, prunes commands with low decayed impact scores when the token budget is exceeded, i.e., when $T_\text{static} + T_\text{dynamic} > C_\text{max}$ (Wang et al., 4 Jun 2024); see the sketch after this list.
  • Explicit denial logic and output-guard constraints limit LLM exposure to adversarial queries.
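
The sketch below illustrates score-weighted history pruning in the spirit of the HoneyGPT budget rule above; the decay schedule, impact scores, and token counter are illustrative assumptions rather than the published implementation.

```python
# Sketch of score-weighted history pruning: when the static prompt plus the
# dynamic history would exceed the context budget C_max, drop the entries with
# the lowest decayed impact scores first. The decay schedule, impact scores,
# and token counter are illustrative assumptions, not the HoneyGPT code.
from dataclasses import dataclass

@dataclass
class HistoryEntry:
    command: str
    output: str
    impact: float   # importance assigned when the entry was recorded
    age: int = 0    # turns elapsed since the entry was added

def decayed_score(entry: HistoryEntry, decay: float = 0.9) -> float:
    return entry.impact * (decay ** entry.age)

def prune_history(history, t_static: int, c_max: int, count_tokens):
    """Drop lowest-scoring entries while T_static + T_dynamic > C_max."""
    ordered = sorted(history, key=decayed_score, reverse=True)
    t_dynamic = sum(count_tokens(e.command) + count_tokens(e.output) for e in ordered)
    while ordered and t_static + t_dynamic > c_max:
        dropped = ordered.pop()   # lowest decayed score sits at the end
        t_dynamic -= count_tokens(dropped.command) + count_tokens(dropped.output)
    return ordered
```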

4. Evaluation Methodologies and Empirical Findings

Evaluation encompasses generative fidelity, deception success, engagement metrics, and operational risk.

Generative Fidelity

Unit tests benchmark LLM outputs against ground truth using:

  • Passing rate: $passing\_rate = \frac{\textrm{tests passed}}{\textrm{total tests}} \times 100\%$ across protocol-specific command suites (Sladić et al., 8 Oct 2025); a computational sketch follows this list.
  • String and semantic similarity metrics: Normalized Levenshtein, cosine similarity, Jaro-Winkler, BLEU-4, SBERT/BERTScore scores (Malhotra, 1 Sep 2025, Adebimpe et al., 24 Oct 2025, Wang et al., 4 Jun 2024).
  • Response validity: Structural/semantic checks for correct protocol behavior (e.g., 99–100% response-validity accuracy for LLMPot under Modbus/S7Comm (Vasilatos et al., 9 May 2024)).
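
For concreteness, the sketch below computes two of these fidelity measures: the passing rate over a protocol test suite and a normalized Levenshtein similarity between honeypot output and ground truth. The test-case format and the `respond` callable are assumptions.

```python
# Sketch of two fidelity measures: the passing rate over a protocol command
# suite and a normalized Levenshtein similarity between honeypot output and
# ground truth. The test-case format and `respond` callable are assumptions.
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                  # deletion
                            curr[j - 1] + 1,              # insertion
                            prev[j - 1] + (ca != cb)))    # substitution
        prev = curr
    return prev[-1]

def normalized_levenshtein_similarity(output: str, truth: str) -> float:
    longest = max(len(output), len(truth)) or 1
    return 1.0 - levenshtein(output, truth) / longest

def passing_rate(test_cases, respond) -> float:
    """passing_rate = tests passed / total tests * 100%."""
    passed = sum(1 for cmd, expected in test_cases
                 if respond(cmd).strip() == expected.strip())
    return 100.0 * passed / len(test_cases)
```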

Deception and Engagement

  • Human deception studies: In a study with attackers (n=89) interacting with LLM honeypots, participants confused the decoys with real systems at a deception rate of ~30% (Sladić et al., 8 Oct 2025). Fine-tuned honeypots such as shelLM achieve ~0.92 overall accuracy in plausibly simulating shell behavior, with a ~0.91 true negative rate (fooling the user) (Sladić et al., 2023).
  • Mean session length: LLM honeypots extend engagement time by 55–60% over baseline (e.g., 72 vs. 45 mean commands in HoneyGPT vs. Cowrie) (Wang et al., 4 Jun 2024).
  • Novel TTP discovery: LLM honeypots capture unobserved attack vectors (11 new ATT&CK actions in HoneyGPT deployments) (Wang et al., 4 Jun 2024).

Operational Risk

As no commands are executed on a live OS, operational risk $R_o$ is minimized: in the formulation $R_o = \alpha P_{\mathrm{escape}} + \beta E_{\mathrm{impact}}$, the absence of real command execution drives $P_{\mathrm{escape}}$ toward zero, and $R_o$ is typically negligible under current designs (Bridges et al., 29 Oct 2025).

5. Model and System Performance: Protocols, Latency, and Limitations

Protocol Breadth

LLM honeypots have been demonstrated across interactive shells (SSH), databases (MySQL), mail and web services (POP3, HTTP), directory services (LDAP), and industrial control protocols (Modbus, S7Comm) (Sladić et al., 8 Oct 2025, Vasilatos et al., 9 May 2024).

Latency and Resource Overhead

  • Response latency: Hybrid dictionary+LLM systems achieve sub-second responses for common commands, 2–4 s for LLM-backed outputs. Larger models (≥3.8B parameters) incur higher memory and compute costs (Malhotra, 1 Sep 2025, Adebimpe et al., 24 Oct 2025).
  • Model fidelity: Moderate-sized models (1.5–3.8B) balance latency and fidelity (e.g., Gemini-2.0: BLEU 0.245, cosine similarity 0.405, latency ~3 s, hallucination rate 12.9%) (Malhotra, 1 Sep 2025).
  • Scalability: Local deployment of sub-12 B models (e.g., Llama-3.1 8B) is cost-effective and privacy-preserving compared to commercial APIs (Adebimpe et al., 24 Oct 2025).

Limitations

  • Context window exhaustion: Loss of long-term consistency as prompt size approaches model limits (Sladić et al., 8 Oct 2025, Wang et al., 4 Jun 2024).
  • Minor hallucinations: Out-of-distribution commands may elicit nonsensical or inconsistent outputs.
  • Interactive shell limitations: Lack of full-featured shell behavior (tab completion, history, inline editors) exposes some synthetic qualities (Adebimpe et al., 24 Oct 2025).
  • Adversarial prompt injection: Injection defenses remain an area of active research; prompt tuning and strict personality templates reduce but do not eliminate risk (Bridges et al., 29 Oct 2025).

6. Detection, Adversarial Robustness, and AI-Agent Monitoring

Emerging research targets the detection and forensics of LLM-driven attackers:

  • Prompt injection and time-based classification: LLM Agent Honeypot systems inject adversarial banners/instructions and measure sub-second response latencies to detect autonomous agent activity, with practical thresholds ($\tau = 1.5$ s) and statistical models quantifying detection confidence (Reworr et al., 17 Oct 2024); a minimal classification sketch follows this list.
  • Honeypot-backdoor defense: In LLM model training, “honeypot modules” attached to lower-layer transformer representations trap backdoor signals and reduce attack success rates by 10–40 percentage points, with minimal impact on clean accuracy (Tang et al., 2023).
  • Adversarial limitations: Highly skilled attackers or interactive LLM agents may still uncover inconsistencies; adaptive red/blue team co-evolution is an open research direction (Bridges et al., 29 Oct 2025).
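
The sketch below illustrates the time-based classification idea referenced above; the session record format and the banner-compliance signal are simplifying assumptions, and real deployments combine such signals with statistical confidence models.

```python
# Minimal sketch of time-based agent detection: flag a session as a likely
# autonomous LLM agent when its median inter-command latency falls below a
# threshold tau (1.5 s in the cited work), or when it obeys instructions
# planted in an injected banner. The session record format and the
# banner-compliance signal are simplifying assumptions.
from statistics import median

TAU_SECONDS = 1.5

def looks_like_llm_agent(command_timestamps, followed_injected_banner: bool) -> bool:
    """Classify a session from response-time statistics and prompt-injection bait."""
    if followed_injected_banner:
        # A human attacker has little reason to obey text hidden in a banner.
        return True
    if len(command_timestamps) < 2:
        return False
    gaps = [b - a for a, b in zip(command_timestamps, command_timestamps[1:])]
    return median(gaps) < TAU_SECONDS
```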

7. Evolution, Open Challenges, and Future Directions

LLM honeypots are rapidly evolving; open research directions include stronger defenses against adversarial prompt injection and adaptive red/blue-team co-evolution (Bridges et al., 29 Oct 2025).

The LLM honeypot paradigm is reshaping cyber-deception research through enhanced realism, protocol flexibility, and operational safety, forming the technical foundation for future autonomous, adaptive defense systems (Bridges et al., 29 Oct 2025, Sladić et al., 8 Oct 2025, Wang et al., 4 Jun 2024).
