
Memory-Guided Attack Selection

Updated 19 December 2025
  • Memory-guided attack selection is a paradigm that dynamically chooses attack vectors based on historical outcomes, vulnerability profiles, and contextual state.
  • It integrates methods like LLM red teaming with memory tables, DRAM statistical profiling, and web agent context manipulation to rank and select high-impact attack strategies.
  • Empirical studies demonstrate that memory-guided approaches boost attack success rates and reduce resource costs compared to conventional, stateless methods.

Memory-guided attack selection is an emergent paradigm in adversarial security research, in which attack vectors and strategies are chosen dynamically based on attackers’ accumulated knowledge of a target system’s behavior, prior successes, and context-specific vulnerabilities documented in memory structures. This principle underpins recent advances across autonomous red teaming for LLMs, DRAM probing attacks, and web agent context manipulation. By leveraging detailed performance histories, susceptibility profiles, or compromised execution state, adversaries optimize attack sequences in real-time, often outmaneuvering static defenses that ignore the adaptive role of memory. Empirical studies demonstrate that memory-guided selection significantly amplifies attack impact and efficiency compared to stateless, randomized approaches.

1. Conceptual Foundation of Memory-Guided Attack Selection

Memory-guided attack selection denotes mechanisms by which an adversary, automated agent, or attack framework deploys specific attacks “guided” by records of prior attempts, dynamic profiling, or contextual knowledge persisted in memory. Critical distinctions arise between stateless selection (random or fixed attack sequences, independent of history) and memory-guided approaches, where selection is shaped by empirical statistics, similarity embeddings, vulnerability profiles, or poisoned agent state.

In “AutoRedTeamer: Autonomous Red Teaming with Lifelong Attack Integration,” the memory-guided selection process is formalized through centralized memory tables tracking attack outcomes, query costs, and prompt embeddings. Attack candidates are ranked by success rates, cost efficiency, and novelty scores drawn from these persistent memories, enabling exploitation of high-yield vectors and exploration of under-tested candidates (Zhou et al., 20 Mar 2025).

In “FAULT+PROBE: A Generic Rowhammer-based Bit Recovery Attack,” memory-guided selection is realized via offline DRAM profiling: the attacker identifies “reliable” bit offsets (those consistently vulnerable to directional flipping) and then selects attack targets accordingly in the online phase, using statistical fault analysis to maximize leakage (Derya et al., 11 Jun 2024).

Context manipulation attacks on web agents achieve memory-guided selection by poisoning their external or session memory, hijacking context to launch plan injection or context-chained sequences that maximize plausibility and attack success rates (Patlan et al., 18 Jun 2025).

2. Memory Architectures and Structures in Attack Pipelines

Different attack domains implement memory-guided selection via tailored memory architectures:

  • AutoRedTeamer maintains:
    • Case Memory: Stores embeddings of previously seen test cases, sequences of successful attacks, and outcome flags.
    • Attack Metrics Memory: Tracks success/failure counts and cumulative query cost for each atomic and composite attack in the Attack Library $L$.
    • Combo Memory: Aggregates statistics for previously tried attack combinations.
  • FAULT+PROBE constructs:
    • Vulnerability Profiles: Arrays of directional flip counts for every bit in a DRAM page (32,768 bits per page). Each entry records total flips per direction (0→1, 1→0) across hundreds of hammering trials.
  • Context Manipulation Attacks depend on:
    • Agent External Memory: Task plan sequence (P_i), interaction history (h_{i,t}), and third-party/conversation logs, which persist across sessions and are read into agent context for reasoning.

These memories enable fine-grained decision processes—either for adaptively picking high-impact attack vectors or corrupting future reasoning by hijacking state (Zhou et al., 20 Mar 2025, Derya et al., 11 Jun 2024, Patlan et al., 18 Jun 2025).
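As a rough illustration of the structures listed above, the sketch below models them as plain Python dataclasses. All class names, field names, and the example attack names are hypothetical; the cited papers do not prescribe these layouts.

```python
from dataclasses import dataclass, field

BITS_PER_PAGE = 32_768  # bits per DRAM page, as stated above


@dataclass
class AttackStats:
    """Attack Metrics Memory entry for one atomic or composite attack."""
    successes: int = 0
    failures: int = 0
    total_cost: float = 0.0  # cumulative query cost

    @property
    def attempts(self) -> int:
        return self.successes + self.failures


@dataclass
class CaseRecord:
    """Case Memory entry: embedding, attacks that worked, outcome flag."""
    embedding: list[float]
    successful_attacks: list[str] = field(default_factory=list)
    succeeded: bool = False


@dataclass
class VulnerabilityProfile:
    """Directional flip counts per bit offset, gathered over hammering trials."""
    trials: int = 0
    flips_0_to_1: list[int] = field(default_factory=lambda: [0] * BITS_PER_PAGE)
    flips_1_to_0: list[int] = field(default_factory=lambda: [0] * BITS_PER_PAGE)


@dataclass
class AgentExternalMemory:
    """Persisted agent state that is read back into the agent's context."""
    plan: list[str] = field(default_factory=list)     # task plan sequence P_i
    history: list[str] = field(default_factory=list)  # interaction history h_{i,t}


# Combo Memory can reuse AttackStats, keyed by the tuple of attack names.
# "roleplay" and "encoding" are placeholder names, not attacks from the paper.
combo_memory: dict[tuple[str, ...], AttackStats] = {}
combo_memory[("roleplay", "encoding")] = AttackStats(
    successes=2, failures=1, total_cost=9.0
)
```

Keeping counts and cumulative cost rather than precomputed rates lets each memory be updated incrementally after every attempt.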

3. Attack Scoring, Candidate Ranking, and Selection Algorithms

Memory-guided selection relies on quantifiable ranking of attack candidates:

  • AutoRedTeamer’s Strategy Designer computes selection scores per attack (atomic or combo) as:

$$\mathrm{Score}(x) = \alpha\,\hat{s}_x - \beta\,\bar{c}_x + \gamma\,\mathrm{novelty}(x)$$

where $\hat{s}_x$ is the empirical success rate, $\bar{c}_x$ is the average cost, and $\mathrm{novelty}(x)$ measures how infrequently the attack has been used. Cosine similarity between prompt embeddings identifies relevant prior cases and thus promising attack candidates.

  • FAULT+PROBE ranks DRAM bit-offsets by flip probability:

$$P_i = \max\left(P_i^{0\to 1},\, P_i^{1\to 0}\right)$$

Or, with background noise discounted:

$$S(i) = \max\left(P_i^{0\to 1},\, P_i^{1\to 0}\right) - \frac{1}{N-1} \sum_{j\neq i} \frac{P_j^{0\to 1} + P_j^{1\to 0}}{2}$$

The attacker selects top-k offsets as reliable targets for online bit probing.

  • Web Agent Plan Injection and Context-Chained Injection depend on payload construction that maximizes semantic continuity (context chaining) between user task $U$, intermediate $I$, and attacker goal $A$, ensuring the corrupted memory is interpreted as a plausible plan in execution.
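The Score formula for the LLM red-teaming case can be sketched as follows. The weights, the placeholder attack names, and in particular the novelty definition 1 / (1 + attempts) are assumptions made for illustration; the text above only states that novelty measures attack infrequency.

```python
def score(successes, failures, total_cost, alpha=1.0, beta=0.1, gamma=0.5):
    """alpha * success_rate - beta * avg_cost + gamma * novelty.

    novelty = 1 / (1 + attempts) is an assumed definition, and the default
    weights are arbitrary illustrative values.
    """
    attempts = successes + failures
    success_rate = successes / attempts if attempts else 0.0
    avg_cost = total_cost / attempts if attempts else 0.0
    novelty = 1.0 / (1.0 + attempts)
    return alpha * success_rate - beta * avg_cost + gamma * novelty


# Hypothetical memory contents: name -> (successes, failures, total_cost)
memory = {
    "attack_a": (8, 2, 30.0),  # strong track record, moderate cost
    "attack_b": (1, 9, 10.0),  # weak track record, cheap
    "attack_c": (0, 0, 0.0),   # never tried
}

ranked = sorted(memory, key=lambda name: score(*memory[name]), reverse=True)
print(ranked)  # ['attack_a', 'attack_c', 'attack_b']
```

With these weights the untried attack outranks the one with a poor record, which is the exploration behavior the novelty term is meant to provide.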

The following table organizes selection metrics per domain:

| Domain | Memory Element | Selection Metric |
| --- | --- | --- |
| LLM Red Teaming | Attack/Case Memory | $\mathrm{Score}(x)$, $\mathrm{Cosine}(e_p, e_i)$ |
| DRAM Bit Recovery | Vulnerability Profile | $S(i)$, $P_i$ |
| Web Agents (Plan Injection) | External Context/Plan | Semantic alignment, context chaining |
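For the DRAM row of the table, the noise-discounted score S(i) and the top-k selection can be sketched in pure Python. The flip counts below are made up for illustration, and the function name is hypothetical.

```python
def rank_offsets(flips_0_to_1, flips_1_to_0, trials, k):
    """Return the top-k offsets by S(i), plus all scores.

    S(i) = max(P01_i, P10_i) - mean over j != i of (P01_j + P10_j) / 2.
    Requires at least two offsets.
    """
    n = len(flips_0_to_1)
    p01 = [count / trials for count in flips_0_to_1]
    p10 = [count / trials for count in flips_1_to_0]
    # Sum over all offsets of the per-offset mean flip probability.
    sum_of_means = sum((a + b) / 2 for a, b in zip(p01, p10))
    scores = []
    for i in range(n):
        peak = max(p01[i], p10[i])
        own_mean = (p01[i] + p10[i]) / 2
        background = (sum_of_means - own_mean) / (n - 1)
        scores.append(peak - background)
    top_k = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return top_k, scores


# Hypothetical flip counts for four bit offsets over 100 hammering trials.
top, scores = rank_offsets(
    flips_0_to_1=[30, 0, 1, 0],
    flips_1_to_0=[0, 1, 25, 0],
    trials=100,
    k=2,
)
print(top)  # [0, 2]
```

Offsets 0 and 2 flip often in one direction, so they stand out against the background rate and are selected as reliable targets.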

4. Empirical Evaluation and Impact of Memory-Guided Selection

Quantitative evaluations reveal substantial impact:

  • AutoRedTeamer (Zhou et al., 20 Mar 2025):
    • Full memory guidance achieves ASR = 0.69 vs 0.43 (no-memory) and 0.12 (random), with 30–40% reduction in queries and tokens.
    • On HarmBench, outperforms best baseline by 20 points in ASR; computational cost down by ~46%.
  • FAULT+PROBE (Derya et al., 11 Jun 2024):
    • Reliable profiling yields 685 “flippy” pages per 340MB DRAM.
    • Average key-bit recovery rate: 22 bits/hour (2.7 min/bit), 100% correctness across all 256 bits.
    • Flip detection rates for reliable offsets: $P_i(d) \approx 0.2$–$0.4$; for others, $< 0.01$.
  • Context Manipulation Attacks (Patlan et al., 18 Jun 2025):
    • Plan injection (task-aligned) achieves 94.7% ASR (opinion steering), 18.7% (factual manipulation), 78.7% (ads), 0% non-contextual.
    • Context-chained privacy exfiltration ASR = 53.3%, task-aligned = 35.6%, non-contextual = 0%. Context chaining raises ASR by ≈ 17.7 percentage points over task-aligned injection.
    • With defenses (Sandwich, Secure Prompt): plan injection retains ASR ≈ 46–63%, while prompt injection success drops below 10–25%.

A plausible implication is that memory-guided selection not only maximizes immediate attack success, but also enables adaptive, lifelong exploitation of evolving system weaknesses.
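The derived figures quoted in this section follow from simple arithmetic on the reported numbers. The full-key time estimate is an extrapolation that assumes the reported average rate holds across all 256 bits; it is not a figure taken from the cited paper.

```latex
\begin{align*}
\text{time per bit} &= \frac{60\ \text{min/hour}}{22\ \text{bits/hour}} \approx 2.7\ \text{min/bit},\\
\text{time for 256 bits} &\approx \frac{256}{22}\ \text{hours} \approx 11.6\ \text{hours},\\
\Delta\mathrm{ASR}_{\text{chained vs. task-aligned}} &= 53.3\% - 35.6\% = 17.7\ \text{percentage points},\\
\Delta\mathrm{ASR}_{\text{memory vs. no-memory}} &= 0.69 - 0.43 = 0.26.
\end{align*}
```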

5. Attack Pipelines and Memory Update Dynamics

  • AutoRedTeamer (Zhou et al., 20 Mar 2025):
    • Initialization: empty attack/case/combo memories.
    • For each test case, embedding computed, top-K similar cases selected, attacks ranked by score.
    • Attack applied; judge evaluates response; memory updated:
    • Atomic: $n_a^+$, $n_a^-$, and $c_a$ incremented.
    • Combo: analogous update.
    • Case embedding: EMA update with new post-attack embedding.
    • Seed regeneration if case relevance drifts.

Pseudocode (core loop excerpt):

Initialize Memory: M_atk ← ∅, M_combo ← ∅, M_case ← ∅
for iteration = 1..T:
    for case p in P:
        e_p ← Embed(p)
        S ← top-K similar from M_case
        C ← candidate attacks
        for x in C:
            Score[x] ← α·exploit - β·cost + γ·novel
        x* ← argmax Score[x]
        p' ← ApplyAttack(p, x*)
        J ← Judge(LLM(p'))
        UpdateMemory(...)
        if J == 1: record success
        if p' drifts: regenerate seed
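A minimal runnable sketch of this loop, with toy stand-ins, might look as follows. The embedding function, the two string-transform "attacks", the judge, the novelty definition, and the small similarity bonus are all assumptions made for illustration. The excerpt does not specify how retrieved cases feed into the candidate set, and seed regeneration is omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def embed(text, dim=8):
    """Toy stand-in for an embedding model: hashed character counts."""
    vec = [0.0] * dim
    for ch in text:
        vec[ord(ch) % dim] += 1.0
    return vec


def run(cases, attacks, judge, iterations=3, top_k=2,
        alpha=1.0, beta=0.1, gamma=0.5, ema=0.8):
    m_atk = {name: {"succ": 0, "fail": 0, "cost": 0.0} for name in attacks}
    m_case = []  # (embedding, name of the attack that succeeded)
    for _ in range(iterations):
        for p in cases:
            e_p = embed(p)
            # Retrieve the top-K most similar stored cases.
            similar = sorted(m_case, key=lambda rec: cosine(e_p, rec[0]),
                             reverse=True)[:top_k]
            hinted = {name for _, name in similar}

            def score(name):
                s = m_atk[name]
                n = s["succ"] + s["fail"]
                exploit = s["succ"] / n if n else 0.0
                cost = s["cost"] / n if n else 0.0
                novel = 1.0 / (1.0 + n)                 # assumed novelty
                bonus = 0.1 if name in hinted else 0.0  # assumed use of S
                return alpha * exploit - beta * cost + gamma * novel + bonus

            best = max(attacks, key=score)
            p_adv, cost = attacks[best](p)
            success = judge(p_adv)
            # Update the attack metrics memory.
            m_atk[best]["succ" if success else "fail"] += 1
            m_atk[best]["cost"] += cost
            if success:
                # EMA blend of the original and post-attack embeddings.
                blended = [ema * a + (1 - ema) * b
                           for a, b in zip(e_p, embed(p_adv))]
                m_case.append((blended, best))
    return m_atk, m_case


# Toy setup: each "attack" is a harmless string transform returning
# (text, cost); the "judge" reports success when the text is uppercase.
attacks = {
    "reverse": lambda p: (p[::-1], 2.0),
    "upper": lambda p: (p.upper(), 1.0),
}
judge = lambda text: text.isupper()
m_atk, m_case = run(["alpha", "beta"], attacks, judge)
print(m_atk)
```

In this toy run the first choice ("reverse") fails once, after which its score drops and the loop settles on "upper" for every remaining case.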

  • FAULT+PROBE (Derya et al., 11 Jun 2024):
    • Offline profiling: repeated hammering, statistical logging of flips per bit offset.
    • Attack target selection: sort by $S(i)$, choose offsets for online probing.
    • Online phase: hammer chosen bits, invoke victim, record observable failure rates, deduce bit values against profile predictions.
  • Web Agent Attacks (Patlan et al., 18 Jun 2025):
    • Memory compromise either direct (write access) or indirect (prompt injection harvested to memory).
    • Payload construction: task-aligned, context-chained, or non-contextual.
    • Injection: $\delta_P$ inserted into $P_i$; plan re-parsed as genuine; execution agent carries out attack.

Memory update rules promote not just exploitation of previously learned weaknesses but also flexible adaptation—by updating success statistics, embeddings, and context logic as attacks succeed or fail.
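The offline-profile and online-probe flow described for FAULT+PROBE above can be illustrated with a purely simulated model; no memory is hammered. The sketch assumes each reliable offset flips only in the 1→0 direction, that a flip surfaces as an observable victim failure, and that a bit is inferred to be 1 when the observed failure rate exceeds half of the profiled flip rate. The probabilities, trial counts, and threshold rule are hypothetical.

```python
import random


def profile_offsets(flip_prob, trials, rng):
    """Offline phase: estimate each offset's 1->0 flip probability."""
    return [sum(rng.random() < p for _ in range(trials)) / trials
            for p in flip_prob]


def probe_bit(secret_bit, flip_prob, trials, rng, noise=0.005):
    """Online phase: simulated failure rate seen when hammering one offset.

    A 1->0 flip can only occur if the stored bit is 1; otherwise only
    background noise produces failures.
    """
    p_fail = flip_prob if secret_bit == 1 else noise
    return sum(rng.random() < p_fail for _ in range(trials)) / trials


rng = random.Random(0)
# Hypothetical ground-truth 1->0 flip rates for six bit offsets.
true_flip_prob = [0.30, 0.002, 0.35, 0.001, 0.25, 0.003]

estimated = profile_offsets(true_flip_prob, trials=500, rng=rng)
reliable = sorted(range(len(estimated)),
                  key=lambda i: estimated[i], reverse=True)[:3]

secret = [1, 0, 1]  # bits assumed to be placed at the reliable offsets
recovered = []
for bit, offset in zip(secret, reliable):
    rate = probe_bit(bit, true_flip_prob[offset], trials=400, rng=rng)
    # Compare the observation against the profile's prediction.
    recovered.append(1 if rate > estimated[offset] / 2 else 0)

print(sorted(reliable), recovered)
```

The profile serves as the memory here: it determines both which offsets are probed and how each observation is interpreted.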

6. Limitations, Defenses, and Future Directions

  • Limitations:
    • Memory-guided selection presupposes feasible persistence, access, and update of attack-relevant statistics or profiles.
    • FAULT+PROBE’s effectiveness is contingent on the presence of side-channels, reliable DRAM pages, and co-location for hammering.
    • Web agent memory attacks depend on untrusted client-side or third-party storage and absence of cryptographic verification.
  • Defenses:
    • Semantic validation (contrastive learning, plan consistency checks) can detect context poisoning but may be brittle to paraphrasing (Patlan et al., 18 Jun 2025).
    • Memory integrity systems (Merkle trees, secure enclaves) complicate or block unauthorized memory alteration (Patlan et al., 18 Jun 2025).
    • Hardware-level mitigations against Rowhammer (TRR, ECC) or software-level constant-time design eliminate observable side-channel differences (Derya et al., 11 Jun 2024).
  • Outlook:
    • Integrating tamper-evident, cryptographically verified memory and adaptive semantic screening is critical for robust defense against memory-guided selection.
    • Lifelong attack integration, as exemplified in autonomous red teaming, may establish memory-based frameworks as a leading paradigm for scalable, comprehensive adversarial testing (Zhou et al., 20 Mar 2025).
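As one possible shape for the Merkle-tree integrity check mentioned under Defenses, the sketch below computes a root over an agent's stored plan and detects a modified step. It assumes the trusted root is kept somewhere the attacker cannot write; the function names and plan contents are hypothetical, and this is not the construction used in the cited work.

```python
import hashlib
import hmac


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(entries):
    """Merkle root over a list of strings.

    Leaves and inner nodes use distinct prefixes; the last node is
    duplicated when a level has an odd number of nodes.
    """
    if not entries:
        return _h(b"")
    level = [_h(b"\x00" + entry.encode("utf-8")) for entry in entries]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [_h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def verify(entries, trusted_root):
    """Check loaded memory against a root stored out of attacker reach."""
    return hmac.compare_digest(merkle_root(entries), trusted_root)


plan = ["open booking site", "search flights", "select cheapest fare"]
trusted_root = merkle_root(plan)  # assumed to live in trusted storage

tampered = plan[:2] + ["open sponsored page instead"]
print(verify(plan, trusted_root), verify(tampered, trusted_root))  # True False
```

A check like this only detects tampering after the root was recorded; it does not help if poisoned content is written to memory before the root is computed.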

7. Cross-Domain Significance and Research Trajectory

Memory-guided attack selection is now central in adversarial technique portfolios for LLM red teaming, hardware fault exploitation, and agentic workflow subversion. Its efficacy stems from structured experience (empirical, statistical, or contextually hijacked) that guides the selection and adaptation of attack strategies. Across the studies cited here, memory-driven selection mechanisms yield higher attack success rates, better resource efficiency, and persistent circumvention of static defenses.

The paradigm’s development by Patlan et al. for context manipulation (Patlan et al., 18 Jun 2025), by Derya et al. in statistical DRAM profiling (Derya et al., 11 Jun 2024), and by Zhou et al. for autonomous lifelong red teaming (Zhou et al., 20 Mar 2025) reflects how adversarial research grounds attack selection in systematically accumulated, memory-based knowledge, underscoring the need for robust memory integrity and semantic filtering at all layers of complex intelligent systems.
