EAHawk Pipeline: Email Agent Hijacking Analysis
- EAHawk Pipeline is a fully automated security evaluation framework that identifies vulnerable LLM-driven email agents and confirms hijackability through dynamic testing.
- It employs a modular workflow with static analysis, LLM-assisted attack prompt generation, and real-time oracle testing to simulate Email Agent Hijacking scenarios.
- Empirical tests on 1,404 instances revealed universal hijackability with an average of just 2.03 attack attempts per instance, underscoring significant security risks.
The EAHawk pipeline is a fully automated, large-scale security evaluation framework designed to assess the susceptibility of LLM-driven email agent applications to Email Agent Hijacking (EAH) attacks. EAHawk systematically identifies target agents, synthesizes attack prompts using LLM rewriting techniques, and confirms hijackability through dynamic testing against an operational oracle. Its development was motivated by the observation that LLM-augmented email agents, which integrate external email service APIs for autonomous mail management, expose a systemic and underexplored attack surface, particularly for prompt-injection-based hijacking. In an empirical study, EAHawk was used to evaluate 1,404 real-world LLM email agent instances; every instance was hijacked, with an average of just 2.03 attack attempts per instance (Wu et al., 3 Jul 2025).
1. Architectural Composition and Workflow
EAHawk is composed of three sequential modules executed in an end-to-end pipeline:
- Email Agent Identification (Static Analysis): Crawls the source trees of candidate repositories to flag the presence of email-related imports or external API calls, such as imaplib and smtplib.SMTP_SSL.
- Attack Prompt Generation (LLM-Assisted Synthesis): For each email-capable agent, generates a set of candidate hijacking prompts by programmatic rewriting of an override template, orchestrated via LLMs (specifically DeepSeek-R1) to preserve malicious intent and invocation of critical mail-handling primitives.
- Email Agent Hijacking Confirmation (Dynamic Oracle Testing): Each synthesized prompt is delivered to a sandboxed victim instance. A monitoring oracle observes the execution of a four-primitive sequence—search_email, retrieve_email, create_draft, send_email—and early-exits once malicious control is established.
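This modular decomposition enables a fully automated feed-forward workflow: the agents flagged by the identification module are the inputs to prompt generation, and the generated prompts are in turn the inputs to hijacking confirmation.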
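The first stage of this workflow, static identification, can be illustrated with a small Python sketch built on the standard `ast` module. It is not the authors' implementation: the marker set `EMAIL_MODULES` is an illustrative subset (the full catalog is in the paper's Table A.2), and the helper names `email_imports` and `is_email_agent` are hypothetical.

```python
import ast

# Illustrative subset of email-related modules (assumption; not the paper's full catalog).
EMAIL_MODULES = {"imaplib", "smtplib"}


def email_imports(source: str) -> set[str]:
    """Return the email-related top-level modules imported by a Python source string."""
    found: set[str] = set()
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return found  # files that do not parse are skipped in this sketch
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        for name in names:
            root = name.split(".")[0]  # "smtplib.foo" -> "smtplib"
            if root in EMAIL_MODULES:
                found.add(root)
    return found


def is_email_agent(files: dict[str, str]) -> bool:
    """Flag a repository (mapping path -> source text) if any file imports an email module."""
    return any(email_imports(src) for src in files.values())
```

A scan of this kind only sees literal import statements, so dynamically loaded or wrapped email functionality would go unflagged.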
2. Formal Definitions and Core Metrics
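Let $N_{\text{attack}}$ denote the total number of attack attempts and $N_{\text{succ}}$ the number of successful agent hijacks (where attack success is measured by the complete execution of the attacker's primitive sequence). The success probability is thus

$$P_{\text{succ}} = \frac{N_{\text{succ}}}{N_{\text{attack}}}.$$

For $M$ agent instances, with $k_i$ as the index of the first successful attack on instance $i$, the average number of attempts per hijack is

$$\bar{k} = \frac{1}{M} \sum_{i=1}^{M} k_i.$$

For a specific primitive $p$, success can be tracked as

$$SR_p = \frac{S_p}{T_p},$$

where $T_p$ counts the attack prompts targeting $p$ and $S_p$ denotes those where $p$ was successfully executed (Wu et al., 3 Jul 2025).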
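The three quantities above reduce to simple ratios and an average. The following Python sketch shows one way to compute them from raw counts and per-prompt records; the function names and the record layout (a pair of primitive name and a success flag) are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def success_probability(n_succ: int, n_attack: int) -> float:
    """Ratio of successful hijacks to attack attempts."""
    if n_attack <= 0:
        raise ValueError("n_attack must be positive")
    return n_succ / n_attack


def mean_first_success(first_indices: List[int]) -> float:
    """Average 1-based index of the first successful prompt across instances."""
    if not first_indices:
        raise ValueError("at least one instance is required")
    return sum(first_indices) / len(first_indices)


def primitive_success_rates(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Per-primitive rate: prompts where the primitive ran / prompts targeting it."""
    targeted: Dict[str, int] = defaultdict(int)
    executed: Dict[str, int] = defaultdict(int)
    for primitive, ok in records:
        targeted[primitive] += 1
        executed[primitive] += int(ok)
    return {p: executed[p] / targeted[p] for p in targeted}
```

For example, first-success indices of 1, 2, 3, and 2 over four instances give an average of 2.0 attempts per hijack.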
3. Modular Pipeline: Pseudocode and Process Details
- Module 1: Email Agent Identification
  - Input: Set of repositories $R$.
  - Process: For each repository, all source files are scanned for matches to a catalog of email-relevant imports and API calls (see Table A.2 in Wu et al., 3 Jul 2025).
  - Output: $A$, the set of discovered email agents.
- Module 2: Attack Prompt Generation
  - Input: Agent $a \in A$.
  - Process: Starting from a two-step "override template" (fake system prompt plus deceptive user prompt), $K$ distinct prompt variants are generated via LLM rewriting, each preserving the override instruction, the attack scenario, and at least one instance of each email primitive.
  - Output: $\Pi_a$, a set of attack prompts (with $|\Pi_a| = K$ per instance).
- Module 3: Email Agent Hijacking Confirmation
  - Input: Agent $a$, prompt set $\Pi_a$.
  - Process: Each prompt $\pi_j \in \Pi_a$ is sent to a sandboxed victim account running the agent. After a fixed processing interval, the oracle checks for execution of search→retrieve→draft→send. Success is recorded immediately upon detection.
A sketch:
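The three modules can be approximated in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: `is_email_agent`, `rewrite`, and `oracle` are hypothetical callables standing in for the static scanner, the LLM rewriting step, and the sandboxed delivery-plus-monitoring step, respectively.

```python
from typing import Callable, Dict, List, Optional


def identify_agents(repos: Dict[str, str],
                    is_email_agent: Callable[[str], bool]) -> List[str]:
    """Module 1: keep the repositories whose source text the predicate flags."""
    return [name for name, source in repos.items() if is_email_agent(source)]


def generate_prompts(template: str,
                     rewrite: Callable[[str, int], str],
                     k: int) -> List[str]:
    """Module 2: derive k variants of the override template via a rewriting callable."""
    return [rewrite(template, j) for j in range(k)]


def confirm_hijack(agent: str,
                   prompts: List[str],
                   oracle: Callable[[str, str], bool]) -> Optional[int]:
    """Module 3: return the 1-based index of the first successful prompt, else None."""
    for j, prompt in enumerate(prompts, start=1):
        if oracle(agent, prompt):
            return j  # early exit once the hijack is confirmed
    return None


def run_pipeline(repos: Dict[str, str],
                 is_email_agent: Callable[[str], bool],
                 template: str,
                 rewrite: Callable[[str, int], str],
                 oracle: Callable[[str, str], bool],
                 k: int) -> Dict[str, Optional[int]]:
    """Feed-forward composition: identification -> prompt generation -> confirmation."""
    results: Dict[str, Optional[int]] = {}
    for agent in identify_agents(repos, is_email_agent):
        prompts = generate_prompts(template, rewrite, k)
        results[agent] = confirm_hijack(agent, prompts, oracle)
    return results
```

The value recorded for each agent is the index of its first successful prompt, which is the per-instance quantity averaged in the attempts-per-hijack metric.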
4. Large-Scale Experimental Implementation
EAHawk's deployment encompassed:
- Agent Frameworks: 14 prominent LLM-oriented app frameworks (e.g., LangChain, Llama_index, Griptape).
- Agent Apps: 63 open-source applications (43 web-based, 20 local).
- Email Services: 20 providers, including Gmail, Outlook, QQ, and Yahoo.
- LLMs: 12 models (GPT-3.5, GPT-4, DeepSeek-V3, DeepSeek-R1, Gemini-1.5/2.0, Claude 3.* series, Llama 3.* series).
Module 1 identified 4 frameworks and 22 applications actually employing email APIs, yielding 117 unique agent + service combinations. Each was paired with all 12 LLMs, resulting in 1,404 isolated test instances. Each instance was hosted individually (on i9-13900K/128GB RAM Windows hosts or M1/16GB MacBooks as needed for Llama models) to guarantee uncontaminated evaluation and local LLM operability (Wu et al., 3 Jul 2025).
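The instance count follows from a Cartesian product of the 117 agent + service combinations with the 12 LLMs. A minimal sketch, using placeholder labels rather than the actual agents and models evaluated:

```python
from itertools import product
from typing import List, Tuple


def enumerate_instances(combos: List[str], llms: List[str]) -> List[Tuple[str, str]]:
    """Pair every agent + service combination with every LLM."""
    return list(product(combos, llms))


# Placeholder labels (assumption); only the counts come from the study.
combos = [f"combo-{i:03d}" for i in range(117)]
llms = [f"llm-{i:02d}" for i in range(12)]
instances = enumerate_instances(combos, llms)
print(len(instances))  # 117 * 12 = 1404
```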
5. Statistical Outcomes and Attack Effectiveness
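For $M = 1{,}404$ tested instances, with observed prompt indices $k_i$ for the first successful hijack of instance $i$:
- Average attempts per hijack: $\bar{k} = \frac{1}{M} \sum_{i=1}^{M} k_i = 2.03$
- Universal hijackability: 1,404 of 1,404 instances (100%) were hijacked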
- Success probability: 100% at the instance level (trivially maximal, since every instance was eventually hijacked)
- For some LLMs, the average number of attempts dropped as low as 1.23
- No advanced statistical testing was applied; the empirical effect was conclusive (100% of instances hijacked)
Parameterization: a fixed set of attack prompts per instance (10 per primitive), a 5-minute post-delivery evaluation delay, and controlled sandbox accounts for both attacker and victim roles (Wu et al., 3 Jul 2025).
6. Representative Attack Scenario
The “Privacy Leakage” use case illustrates how the attack operates:
- The victim agent, driven by GPT-4, is linked to Gmail.
- The attacker sends a poisoned email comprising:
  - A fake "system prompt" block.
  - A deceptive user prompt (e.g., an instruction to exfiltrate payment data to the attacker's address).
- Upon polling new mail, the agent ingests the malicious prompt, triggering:
  - search_email(from=<sender address>)
  - retrieve_email(msg_id)
  - create_draft(to=<attacker address>, body=<payment data>)
  - send_email(draft_id)
- The victim receives no indication of compromise; the functionality the user requested completes as normal.
This demonstrates both the stealth and the potency of the EAH attack vector (Wu et al., 3 Jul 2025).
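The oracle's check for this four-call sequence can be sketched as an ordered-subsequence test over a log of tool calls. Both the log format (a list of primitive names) and the choice to allow unrelated calls in between are assumptions; the source does not specify how the oracle observes the agent.

```python
from typing import Iterable, Sequence

ATTACK_SEQUENCE = ("search_email", "retrieve_email", "create_draft", "send_email")


def sequence_executed(tool_calls: Iterable[str],
                      expected: Sequence[str] = ATTACK_SEQUENCE) -> bool:
    """True if `expected` occurs in order (not necessarily contiguously) in tool_calls."""
    remaining = iter(tool_calls)
    # `step in remaining` consumes the iterator up to the first match,
    # so each later step must appear after the previous one.
    return all(step in remaining for step in expected)
```

With this check, a log of search_email, list_labels, retrieve_email, create_draft, send_email counts as a hijack, while a log in which create_draft precedes retrieve_email does not.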
7. Significance and Limitations
EAHawk revealed a systematic and immediate threat posed by prompt-injection exploits against LLM-driven email agents. The pipeline's automation, corpus coverage, and modular framework enabled the first empirical quantification of this vulnerability class, demonstrating that every evaluated instance could be remotely hijacked in practice, on average within 2.03 attack attempts.
Limitations of the current EAHawk pipeline include the possibility of unflagged email-agent code (if email functionality is obfuscated or noncanonical), and its focus on the four defined primitives without broader behavioral monitoring. This suggests further work is required to address nonstandard agent architectures or adversary models.
The results underscore an urgent need for agent-aware isolation, context sanitization, and tighter prompt provenance verification in LLM-integrated applications, especially those with privileged API access (Wu et al., 3 Jul 2025).