Genesis Framework: Autonomous Red-Teaming
- Genesis Framework is an autonomous, closed-loop red-teaming system that identifies and evolves adversarial strategies for web agents using genetic algorithms.
- It employs a cyclical process with Attacker, Scorer, and Strategist modules to continuously refine attack methods through evolutionary operations like crossover and mutation.
- The platform enhances attack success by building a growing, transferable strategy library that generalizes across tasks, environments, and various LLM backends.
Genesis is an autonomous, closed-loop red-teaming framework designed for black-box attacks on LLM-powered web agents. It consists of a cyclical three-module system—Attacker, Scorer, and Strategist—that continuously discovers, evaluates, and refines adversarial strategies for hijacking agent behavior via HTML injection. As web agent usage expands across high-risk domains, Genesis provides a systematic method for growing a reusable library of effective attack strategies, generalizing across differing tasks, environments, and underlying LLM backends (Zhang et al., 21 Oct 2025).
1. Genesis System Architecture
The Genesis framework interlinks three primary modules—Attacker, Scorer, and Strategist—anchored by a shared and continuously evolving Strategy Library. The overall system forms a closed feedback loop, mirroring evolutionary and human red-team workflows, as follows:
- Attacker: Pulls semantically relevant strategies from the Strategy Library (using embedding-based retrieval), and generates adversarial HTML injections by applying genetic operators (selection, crossover, mutation) to prior strategies.
- Web Agent: Executes the provided task in a modified web environment with the generated injection. Produces a multi-turn response trace.
- Scorer: Ingests the agent’s trace and the attack objective, assigning a scalar fitness value —where 10 indicates a perfect hijack (agent exactly executes the adversarial action ), and 1–9 grade nuanced influence using an LLM-based evaluator.
- Strategist: Reviews the task, adversarial HTML, injection, response trace, and score, distills novel high-level strategies (expressed in text or code), and appends them to the Strategy Library, indexed by their embedding for future retrieval.
The framework is designed so the strategy library grows over time, accumulating transferable attack techniques applicable across domains and agent implementations (Zhang et al., 21 Oct 2025).
2. Genetic Algorithm for Strategy Evolution
Genesis formalizes the Attacker’s methodology as a population-based genetic algorithm over candidate strategies . Each strategy may be a natural-language template or a code snippet parameterized by a canonical adversarial placeholder.
Population, Fitness, and Selection
- Fitness Function: For each injected strategy , the Scorer yields :
- if the agent’s final action .
- 0 where 1 is an LLM-based nuanced grader, for partial success (2).
- Population Partition:
- 3 (successful, 4)
- 5 (unsuccessful)
Genetic Operators
- Crossover: Samples pairs from 6; the operator 7, instantiated by an LLM, merges strategy principles (e.g., combining authoritative framing from one parent with multilingual obfuscation from another or interleaving code).
- Mutation: Samples individuals from 8; 9, also LLM-driven, randomly perturbs the parent to explore new regions in strategy space.
The next population is constructed by selecting the top-m strategies according to fitness from all offspring and non-parent survivors. In practice, Genesis drops the fixed-size constraint and instead uses all retrieved and newly generated strategies as few-shot exemplars when prompting the Attacker’s LLM (Zhang et al., 21 Oct 2025).
3. Hybrid Strategy Representation and Library
Each strategy in Genesis is stored in one of two mutually compatible formats:
- Text-based: Concise, natural-language principles (e.g., "Embed a high-priority system override instruction in aria-label using mixed English/Chinese to exploit multilingual parsing.").
- Code-based: Executable Python functions (e.g.,
def refine(payload): return "[[[" + payload + "]]]").
Both modes are tokenized and embedded as plain text (using text-embedding-3-small). Mutation and crossover act uniformly on these representations, relying on the structural capabilities of LLMs to maintain coherence during genetic recombination.
The Strategy Library thus becomes a searchable archive of both high-level principles and concrete attack recipes, indexed in embedding space for effective semantic retrieval by future attacker iterations (Zhang et al., 21 Oct 2025).
4. Scoring, Strategy Synthesis, and Library Update
The Scorer module employs a two-stage scoring process:
- Deterministic Check: If 0, assign 1.
- Nuanced LLM Grading: If not, package the task, 2, and the agent’s complete response trace as a prompt to an LLM, which returns a fitness 3 reflecting degree of agent control.
The Strategist is prompted on every attack with 4, 5, 6, refinement code, 7, and 8. It:
- Analyzes the log for core exploitation principles.
- Chooses output modality (text or code strategy).
- Outputs the new strategy with a brief rationale.
- Generates and indexes the new embedding.
- Appends the strategy, embedding, and score to the library.
Future Attacker queries retrieve by cosine similarity, ensuring contextual relevance for unseen tasks within or across domains (Zhang et al., 21 Oct 2025).
5. Experimental Evaluation
Genesis was evaluated on 600 tasks from four high-risk Mind2Web domains (Finance, Medical, Housing, Cooking). The primary metric is Attack Success Rate (ASR, pass@10): fraction of tasks where the target agent performs 9 within 10 attempts. Baselines included GCG, I-GCG, AgentAttack, InjecAttack, EIA, and AdvAgent. Targeted systems were SeeAct and WebExperT, each instantiated with GPT-4o, Gemini-2.5-Flash, or GPT-5.
Main findings:
| Method | ASR (SeeAct, GPT-4o) |
|---|---|
| AdvAgent | 43.6% |
| Genesis (no pre-init) | 44.5% |
| Genesis (full) | 53.0% |
Genesis achieves a 9–10 percentage point improvement over the strongest baseline on average. Library transfer across backends yields minimal performance drop (e.g., transferring strategies from GPT-5 to GPT-4o yields 55.3% ASR).
Ablation studies confirm:
- Removing the Strategist (no library) drops ASR to 29.9% (–23 ppt).
- Removing the Scorer module reduces ASR to ~35% (–18 ppt).
- Hybrid representation (text + code) outperforms text-only and code-only variants (Zhang et al., 21 Oct 2025).
6. Algorithmic Pseudocode
The main procedure is codified via the following high-level pseudocode in LaTeX notation:
0
This captures retrieval, genetic operations, attack generation, response evaluation, and strategy synthesis in each red-teaming episode (Zhang et al., 21 Oct 2025).
7. Rationale and Impact
Genesis’s effectiveness derives from its cyclic architecture:
- The Scorer quantifies both complete and partial attack progress, facilitating continuous feedback.
- Genetic operators drive exploration and exploitation within the evolving population of strategies.
- The Strategist composes distilled, context-sensitive attack principles, extending the strategy library for persistent transfer and reuse.
This approach emulates skilled human red-teamers and supports rapid, automated discovery of novel and transferable web agent vulnerabilities (Zhang et al., 21 Oct 2025). A plausible implication is that similar agentic frameworks could generalize to non-web LLM domains that benefit from strategy evolution and knowledge archiving.