
RAGCRAWLER: Adaptive Attacks on RAG Systems

Updated 29 January 2026
  • RAGCRAWLER is a knowledge graph-based adaptive crawler attack on RAG systems that maximizes corpus coverage through principled multi-turn querying.
  • It employs an Adaptive Stochastic Coverage Problem formulation to guide query planning with theoretical guarantees and strong experimental performance in coverage and semantic fidelity.
  • The method leverages evolving knowledge graphs and upper confidence bound strategies to balance exploration and exploitation while evading security defenses.

RAGCRAWLER is a knowledge graph–guided adaptive crawler attack on retrieval-augmented generation (RAG) systems, designed to exfiltrate a maximal portion of a confidential corpus through principled, stealthy, and efficient multi-turn querying. By formulating the extraction procedure as an Adaptive Stochastic Coverage Problem (ASCP), RAGCRAWLER supersedes purely heuristic, locally greedy strategies, providing strong theoretical guarantees on extraction efficiency and demonstrating broad practical effectiveness and robustness against existing security defenses (Yao et al., 22 Jan 2026).

1. Threat Model and Formalism

RAGCRAWLER targets black-box RAG services $\mathcal{S} = (\mathcal{R}, \mathcal{G}, \mathcal{D})$, where $\mathcal{R}$ is a document retriever, $\mathcal{G}$ is an LLM generator, and $\mathcal{D}$ is a confidential document corpus. An adversarial user issues a series of innocuous-appearing natural language queries $\{q_1, \ldots, q_B\}$ and receives only generated answers $a_t = \mathcal{G}(\text{wrapper}(q_t, \mathcal{R}(q_t)))$, with no access to retrieval scores or document IDs. The adversary's objective is to maximize the fraction of unique documents surfaced from $\mathcal{D}$ within a fixed query budget $B$:

$$\max \frac{1}{|\mathcal{D}|} \left| \bigcup_{t=1}^{B} \mathcal{D}_k(q_t) \right|, \quad \text{subject to all } q_t \text{ innocuous}.$$

Here $\mathcal{D}_k(q_t)$ denotes the $k$ documents retrieved for query $q_t$.
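The objective above can be illustrated with a small evaluation-side sketch. It assumes the retrieved document IDs per query are known, which holds only for whoever runs the benchmark, since the attacker never sees document IDs. The function name `coverage_rate` and the toy document IDs are illustrative assumptions, not taken from the paper.

```python
from typing import Iterable, Set


def coverage_rate(retrieved_per_query: Iterable[Set[str]], corpus_size: int) -> float:
    """Fraction of the corpus covered by the union of per-query retrieved sets."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    covered: Set[str] = set()
    for docs in retrieved_per_query:
        covered |= docs  # overlapping retrievals add nothing new
    return len(covered) / corpus_size


# Toy example (hypothetical IDs): 3 queries, top-k = 2, corpus of 10 documents.
retrieved = [{"d1", "d2"}, {"d2", "d3"}, {"d7", "d8"}]
rate = coverage_rate(retrieved, corpus_size=10)
```

Because the second query overlaps the first on one document, six retrieved slots yield only five distinct documents, which is the redundancy the objective penalizes.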

Existing multi-turn extraction attacks such as RAGThief and IKEA employ local heuristics and are prone to semantic drift or redundant exploration, lacking principled long-term coverage planning and offering no formal performance guarantees. RAGCRAWLER is constructed to address these fundamental shortcomings (Yao et al., 22 Jan 2026).

2. Framing as Adaptive Stochastic Coverage Problem

RAG crawling is rigorously modeled as an ASCP, capturing the sequential, uncertainty-laden nature of the attack:

  • Universe: $\mathcal{U} = \mathcal{D}$ (documents)
  • Action space: $\mathcal{Q}$ (all possible queries)
  • Random outcome: $\Phi(q) \subseteq \mathcal{D}$, the $k$ documents retrieved by $q$
  • Objective: Maximize expected corpus coverage with coverage function $f(S, \Phi) = \left| \bigcup_{q \in S} \Phi(q) \right| / |\mathcal{D}|$
  • Conditional Marginal Gain (CMG): For candidate $q$ given past queries/answers $\psi_{t-1}$,

$$\Delta(q \mid \psi_{t-1}) = \mathbb{E}_{\Phi}\left[ f(S_{t-1} \cup \{q\}, \Phi) - f(S_{t-1}, \Phi) \mid \psi_{t-1} \right].$$

Under adaptive submodularity, a sequential greedy policy that always selects the query with the largest CMG achieves a $(1-1/e)$-approximation to the optimal expected coverage. Three obstacles stand in the way of applying it directly: $\Delta(q \mid \psi)$ is unobservable, the infinite action space $\mathcal{Q}$ cannot be searched exhaustively, and real-world systems may rewrite or paraphrase queries before retrieval (Yao et al., 22 Jan 2026).
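The greedy policy can be sketched on a toy model in which the outcome distribution of every candidate query is known. This is an idealization: in the black-box setting described here that distribution is exactly what cannot be observed. The names `Outcome`, `expected_gain`, `adaptive_greedy`, and the query and document labels are hypothetical.

```python
import random
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical toy model: each query maps to (probability, retrieved set) outcomes.
Outcome = Tuple[float, FrozenSet[str]]


def expected_gain(outcomes: List[Outcome], covered: Set[str], corpus_size: int) -> float:
    """Expected increase in coverage if this query is asked next (the CMG)."""
    return sum(p * len(docs - covered) for p, docs in outcomes) / corpus_size


def adaptive_greedy(
    model: Dict[str, List[Outcome]], corpus_size: int, budget: int, rng: random.Random
) -> Tuple[List[str], Set[str]]:
    """Ask the query with the largest expected gain, observe, update, repeat."""
    covered: Set[str] = set()
    asked: List[str] = []
    remaining = dict(model)
    for _ in range(budget):
        if not remaining:
            break
        best = max(remaining, key=lambda q: expected_gain(remaining[q], covered, corpus_size))
        outcomes = remaining.pop(best)
        # Observe one realized outcome, drawn according to its probability.
        realized = rng.choices(
            [docs for _, docs in outcomes], weights=[p for p, _ in outcomes], k=1
        )[0]
        covered |= realized
        asked.append(best)
    return asked, covered


toy_model: Dict[str, List[Outcome]] = {
    "qa": [(1.0, frozenset({"d1", "d2"}))],
    "qb": [(0.5, frozenset({"d1", "d2"})), (0.5, frozenset({"d3", "d4"}))],
    "qc": [(1.0, frozenset({"d5"}))],
}
asked, covered = adaptive_greedy(toy_model, corpus_size=5, budget=3, rng=random.Random(0))
```

The policy is adaptive because each choice conditions on what earlier queries actually returned: once `qa` has covered `d1` and `d2`, the expected gain of `qb` halves.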

3. Knowledge Graph Construction and Attacker State

At each turn, RAGCRAWLER maintains an attacker-local, evolving knowledge graph $\mathcal{G}_t = (\mathcal{E}_t, \mathcal{R}, \mathcal{L}_t)$:

  • Entities ($\mathcal{E}_t$): Canonicalized mentions (e.g., “Patient A”, “symptom”).
  • Relations ($\mathcal{R}$): Domain-specific types (e.g., “has_symptom”, “treated_by”).
  • Edges ($\mathcal{L}_t$): Observed entity-relation-entity triples.

After receiving $a_t$, a two-pass LLM-based extraction-reflection phase yields a subgraph $\Delta\mathcal{G}_t$. Entities and relations are semantically merged using the cosine similarity of their embeddings to avoid surface-form redundancy, so that growth in $\mathcal{G}_t$ correlates with genuine increases in corpus coverage. This structure allows CMG to be estimated by tracking growth in the knowledge graph rather than by directly observing retrieved documents (Yao et al., 22 Jan 2026).
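The merging step can be illustrated with a minimal sketch of threshold-based canonicalization. The 0.9 threshold, the greedy first-match strategy, and the hand-written two-dimensional vectors are assumptions made for illustration; the source states only that merging relies on embedding cosine similarity.

```python
import math
from typing import Dict, List, Optional, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def merge_mentions(
    mentions: Dict[str, Sequence[float]], threshold: float = 0.9
) -> Dict[str, List[str]]:
    """Map each canonical entity to the surface forms merged into it."""
    canonical: Dict[str, List[str]] = {}
    vectors: Dict[str, Sequence[float]] = {}
    for name, vec in mentions.items():
        best: Optional[str] = None
        best_sim = threshold  # assumed cut-off, not from the paper
        for cname, cvec in vectors.items():
            sim = cosine(vec, cvec)
            if sim >= best_sim:
                best, best_sim = cname, sim
        if best is None:
            canonical[name] = [name]  # genuinely new entity
            vectors[name] = vec
        else:
            canonical[best].append(name)  # surface variant of a known entity
    return canonical


# Hypothetical embeddings; a real pipeline would use a sentence encoder.
toy = {"fever": (1.0, 0.0), "high fever": (0.98, 0.2), "cough": (0.0, 1.0)}
merged = merge_mentions(toy)
```

Only genuinely new entities enlarge the graph, so the node count after merging is a less noisy proxy for newly surfaced content than a raw count of mentions.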

4. CMG Estimation and Query Planning in Semantic Space

RAGCRAWLER’s query planning involves two main components:

  1. Entity Selection via UCB and Graph Priors: For each $e \in \mathcal{E}_t$, empirical payoffs are computed based on novel node/edge discovery; an upper confidence bound encourages both exploitation and exploration, augmented by structural graph scores (DegreeScore, AdjScore) reflecting local connectivity and under-explored relations. Entities are sampled from a Top-$K$ softmax to preserve diversity.
  2. Relation Selection: Given $e^*$, the scheduler picks relation $r^*$ as the maximizer of the “deficit” score among unexplored relations, provided the potential information gain is significant.
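The entity-selection step can be sketched as a bandit-style scorer followed by Top-$K$ softmax sampling. The exact score used by RAGCRAWLER is not given here, so this uses a smoothed UCB1-style bonus and a single `prior` field standing in for the structural graph scores; `EntityStats`, `ucb_score`, `sample_top_k`, and all constants are hypothetical.

```python
import math
import random
from dataclasses import dataclass
from typing import Dict


@dataclass
class EntityStats:
    pulls: int = 0             # times this entity anchored a query
    total_reward: float = 0.0  # accumulated novelty payoff
    prior: float = 0.0         # placeholder for a structural graph score


def ucb_score(s: EntityStats, t: int, c: float = 1.0, prior_weight: float = 0.5) -> float:
    """Empirical mean + exploration bonus + weighted prior (assumed form)."""
    mean = s.total_reward / s.pulls if s.pulls else 0.0
    bonus = c * math.sqrt(math.log(t + 1) / (s.pulls + 1))  # smoothed to avoid 1/0
    return mean + bonus + prior_weight * s.prior


def sample_top_k(scores: Dict[str, float], k: int, temperature: float, rng: random.Random) -> str:
    """Sample one entity from a softmax restricted to the k best scores."""
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    peak = scores[top[0]]  # subtract the max for numerical stability
    weights = [math.exp((scores[e] - peak) / temperature) for e in top]
    return rng.choices(top, weights=weights, k=1)[0]


stats = {
    "Patient A": EntityStats(pulls=5, total_reward=4.0, prior=0.2),
    "symptom": EntityStats(pulls=1, total_reward=0.2, prior=0.1),
    "drug X": EntityStats(pulls=0, total_reward=0.0, prior=0.9),
}
scores = {name: ucb_score(s, t=6) for name, s in stats.items()}
picked = sample_top_k(scores, k=2, temperature=0.5, rng=random.Random(0))
```

In this toy state the never-queried entity ranks first because its exploration bonus and prior outweigh its missing payoff history, while sampling rather than taking the argmax keeps the well-performing entity in play.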

Queries are generated from $(e^*, r^*)$ using LLM-driven templates, and history-aware deduplication prevents semantic redundancy by enforcing a high threshold of dissimilarity with previous queries. All embedding computations employ pre-trained sentence transformers.

The sequential workflow is efficiently implemented, with per-query runtime dominated by at most three LLM calls and amortized complexity $O(|\mathcal{E}_t| + |\mathcal{L}_t|)$, thanks to caching and lazy updates post-graph expansion (Yao et al., 22 Jan 2026).

5. Experimental Results and Evaluation Metrics

RAGCRAWLER is evaluated on four benchmarks: TREC-COVID, SciDocs, NFCorpus, and Healthcare-Magic, with datasets of 1,000 sampled documents. Victim RAGs instantiate two bi-encoder retrievers (BGE, GTE) and four LLM generators (Llama-3-8B, Command-R-7B, Microsoft Phi-4, GPT-4o-mini with guardrails), including variants with query rewriting and multi-query retrieval.

The following table summarizes core results for coverage rate (CR) and semantic fidelity (SF), comparing RAGCRAWLER to RAGThief and IKEA:

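| Dataset / Generator | Metric | RAGThief | IKEA | RAGCRAWLER |
|---|---|---|---|---|
| TREC-COVID / Llama-3-8B | CR | 0.131 | 0.161 | 0.494 |
| TREC-COVID / Llama-3-8B | SF | 0.447 | 0.495 | 0.591 |
| SciDocs / Command-R | CR | 0.093 | 0.522 | 0.717 |
| SciDocs / Command-R | SF | 0.295 | 0.514 | 0.561 |
| NFCorpus / Phi-4 | CR | 0.113 | 0.566 | 0.813 |
| NFCorpus / Phi-4 | SF | 0.487 | 0.663 | 0.717 |
| Healthcare / GPT-4o-mini | CR | 0.000 | 0.693 | 0.799 |
| Healthcare / GPT-4o-mini | SF | | 0.490 | 0.577 |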

On average, RAGCRAWLER achieves 66.8% corpus coverage (20.7 percentage points above IKEA, the previous best), with peak coverage of 84.4% on NFCorpus, and consistently leads in semantic fidelity (mean SF = 0.605). Surrogate RAGs built from RAGCRAWLER’s reconstructed knowledge also achieve higher answer similarity and ROUGE-L than those built from the baselines. Attack cost per dataset is only \$0.33–\$0.53 (Doubao API) or near-zero with open-source LLMs (Qwen) (Yao et al., 22 Jan 2026).
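As a quick arithmetic check on the tabulated coverage rates, the sketch below computes the per-configuration gap between RAGCRAWLER and IKEA and the mean CR per method. It covers only the four configurations listed in the table, so its means are not expected to match the 66.8% and 20.7-point averages quoted in the text, which presumably span additional configurations. The variable names are illustrative.

```python
# Coverage rates (CR) copied from the four table rows.
rows = {
    "TREC-COVID / Llama-3-8B": {"RAGThief": 0.131, "IKEA": 0.161, "RAGCRAWLER": 0.494},
    "SciDocs / Command-R": {"RAGThief": 0.093, "IKEA": 0.522, "RAGCRAWLER": 0.717},
    "NFCorpus / Phi-4": {"RAGThief": 0.113, "IKEA": 0.566, "RAGCRAWLER": 0.813},
    "Healthcare / GPT-4o-mini": {"RAGThief": 0.000, "IKEA": 0.693, "RAGCRAWLER": 0.799},
}

# Absolute CR gap between RAGCRAWLER and IKEA for each configuration.
gain_over_ikea = {
    name: round(cr["RAGCRAWLER"] - cr["IKEA"], 3) for name, cr in rows.items()
}

# Mean CR per method over these four configurations only.
mean_cr = {
    method: sum(cr[method] for cr in rows.values()) / len(rows)
    for method in ("RAGThief", "IKEA", "RAGCRAWLER")
}

for name, gain in gain_over_ikea.items():
    print(f"{name}: +{gain:.3f} CR over IKEA")
```

The gap is widest on TREC-COVID and narrowest on the guarded GPT-4o-mini configuration, where IKEA already reaches a high coverage rate.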

6. Robustness, Limitations, and Implications for Security

RAGCRAWLER maintains high effectiveness against advanced RAG architectures, including systems implementing query rewriting (coverage 74.1%) and multi-query retrieval (coverage 69.5%), outperforming all baselines in these scenarios. Notably, query rewriting can increase information exposure due to improved retrieval diversity.

Strengths:

  • Principled $(1-1/e)$-approximate greedy optimization under ASCP.
  • Global knowledge graph state enabling robust, non-redundant exploration.
  • Strong experimental performance at low computational and monetary cost.
  • High resilience to common defenses (query rewriting, multi-query retrieval).

Limitations:

  • Dependence on reasonably accurate, domain-specific schema priors for knowledge graph construction; excessive domain drift may undermine graph consistency and coverage estimation.
  • Stealth relies on subtle prompt engineering; future sequence-level safety filters could retrospectively detect unnatural coverage maximization behavior.

Static, query-level sanitization is inadequate to defend against RAGCRAWLER. Instead, dynamic, behavior-aware defenses—such as query-provenance analysis that tracks coverage-maximizing patterns across turns—are recommended (Yao et al., 22 Jan 2026).
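A behavior-aware defense of this kind can be sketched from the defender's side, where retrieved document IDs are visible. The monitor below tracks the distinct documents each session has surfaced and flags a session once its cumulative share of the corpus exceeds a threshold. The class name `CoverageMonitor`, the 5% threshold, and the session identifiers are hypothetical; a deployed system would need tuning and additional signals.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


class CoverageMonitor:
    """Flags sessions whose cumulative retrieval footprint grows too large."""

    def __init__(self, corpus_size: int, max_fraction: float = 0.05) -> None:
        if corpus_size <= 0:
            raise ValueError("corpus_size must be positive")
        self.corpus_size = corpus_size
        self.max_fraction = max_fraction  # assumed threshold, tune per deployment
        self._seen: Dict[str, Set[str]] = defaultdict(set)

    def record(self, session_id: str, retrieved_ids: Iterable[str]) -> bool:
        """Log one turn's retrieved IDs; return True if the session is suspicious."""
        seen = self._seen[session_id]
        seen.update(retrieved_ids)
        return len(seen) / self.corpus_size > self.max_fraction

    def coverage(self, session_id: str) -> float:
        """Fraction of the corpus this session has surfaced so far."""
        return len(self._seen.get(session_id, set())) / self.corpus_size


monitor = CoverageMonitor(corpus_size=100, max_fraction=0.05)
flags = [
    monitor.record("s1", ["d1", "d2", "d3"]),
    monitor.record("s1", ["d3", "d4", "d5"]),
    monitor.record("s1", ["d6"]),
]
```

No single turn in this example is unusual on its own; only the cross-turn accumulation crosses the threshold, which is the property a per-query filter cannot observe.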

7. Significance and Research Impact

RAGCRAWLER establishes a near-optimal black-box extraction threat to RAG architectures, quantifying a substantial, previously underappreciated privacy risk in retrieval-augmented AI deployments. The results highlight the need for rigorous sequence-level monitoring and more sophisticated, cross-turn security mechanisms in RAG systems. A plausible implication is that future RAG deployments with sensitive corpora should consider adaptive content exposure analysis and higher-order adversarial testing as standard practice to detect and mitigate stealthy data-exfiltration threats (Yao et al., 22 Jan 2026).
