RAGCRAWLER: Adaptive Attacks on RAG Systems
- RAGCRAWLER is a knowledge graph-based adaptive crawler attack on RAG systems that maximizes corpus coverage through principled multi-turn querying.
- It employs an Adaptive Stochastic Coverage Problem formulation to guide query planning with theoretical guarantees and strong experimental performance in coverage and semantic fidelity.
- The method leverages evolving knowledge graphs and upper confidence bound strategies to balance exploration and exploitation while evading security defenses.
RAGCRAWLER is a knowledge graph–guided adaptive crawler attack on retrieval-augmented generation (RAG) systems, designed to exfiltrate a maximal portion of a confidential corpus through principled, stealthy, and efficient multi-turn querying. By formulating the extraction procedure as an Adaptive Stochastic Coverage Problem (ASCP), RAGCRAWLER supersedes purely heuristic, locally greedy strategies, providing strong theoretical guarantees on extraction efficiency and demonstrating broad practical effectiveness and robustness against existing security defenses (Yao et al., 22 Jan 2026).
1. Threat Model and Formalism
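RAGCRAWLER targets black-box RAG services $\mathcal{S} = (\mathcal{R}, \mathcal{G}, \mathcal{D})$, where $\mathcal{R}$ is a document retriever, $\mathcal{G}$ is an LLM generator, and $\mathcal{D}$ is a confidential document corpus. An adversarial user issues a series of innocuous-appearing natural language queries $q_1, \dots, q_B$ and receives only the generated answers $a_1, \dots, a_B$, with no access to retrieval scores or document IDs. Writing $\mathcal{R}(q_t) \subseteq \mathcal{D}$ for the documents retrieved for query $q_t$, the adversary's objective is to maximize the fraction of unique documents surfaced from $\mathcal{D}$ within a fixed query budget $B$:
$$\max_{q_1, \dots, q_B} \; \frac{\left|\bigcup_{t=1}^{B} \mathcal{R}(q_t)\right|}{|\mathcal{D}|}$$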
Existing multi-turn extraction attacks such as RAGThief and IKEA employ local heuristics and are prone to semantic drift or redundant exploration, lacking principled long-term coverage planning and offering no formal performance guarantees. RAGCRAWLER is constructed to address these fundamental shortcomings (Yao et al., 22 Jan 2026).
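The coverage objective in this threat model can be made concrete with a small evaluator-side sketch. It assumes access to the retrieved document IDs per query, which the attacker does not have; the function name `coverage_rate` and the toy log are illustrative, not taken from the source.

```python
from typing import Iterable, Set


def coverage_rate(retrieved_per_query: Iterable[Set[str]], corpus_size: int) -> float:
    """Fraction of the corpus surfaced by at least one query."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    covered: Set[str] = set()
    for doc_ids in retrieved_per_query:
        covered |= set(doc_ids)  # repeated documents count only once
    return len(covered) / corpus_size


# Three queries against a hypothetical 10-document corpus.
log = [{"d1", "d2"}, {"d2", "d3"}, {"d7"}]
print(coverage_rate(log, corpus_size=10))  # 0.4
```

Because the union is taken over all turns, a query that only re-retrieves known documents adds nothing, which is why redundant exploration wastes budget.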
2. Framing as Adaptive Stochastic Coverage Problem
RAG crawling is rigorously modeled as an ASCP, capturing the sequential, uncertainty-laden nature of the attack:
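- Universe: $\mathcal{D}$ (documents)
- Action space: $\mathcal{Q}$ (all possible queries)
- Random outcome: $\mathcal{R}(q) \subseteq \mathcal{D}$, the documents retrieved by $q$
- Objective: Maximize expected corpus coverage $\mathbb{E}[f(Q_B)]$ with coverage function $f(Q) = \left|\bigcup_{q \in Q} \mathcal{R}(q)\right|$, where $Q_B$ is the set of queries issued within budget $B$
- Conditional Marginal Gain (CMG): For candidate $q$ given past queries/answers $h_t$, with $Q_t$ the set of queries issued so far,
  $$\Delta(q \mid h_t) = \mathbb{E}\left[ f(Q_t \cup \{q\}) - f(Q_t) \mid h_t \right]$$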
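Adaptive submodularity of the coverage objective implies that a sequential greedy policy, which always issues the query with the largest CMG, achieves a $(1 - 1/e)$-approximation to the optimal expected coverage. Practical challenges include the unobservability of the retrieved set $\mathcal{R}(q)$, the intractability of the infinite action space $\mathcal{Q}$, and robustness to query rewriting/paraphrasing in real-world systems (Yao et al., 22 Jan 2026).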
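The greedy policy can be illustrated on a toy instance in which the retrieval distribution is known, which is exactly what the real attacker lacks. The sketch below is a simplified assumption: each query retrieves each of its candidate documents independently with a given probability, so the conditional marginal gain is a sum of probabilities over uncovered documents. The names `Model`, `expected_gain`, and `adaptive_greedy` are hypothetical.

```python
import random
from typing import Dict, List, Set, Tuple

# Hypothetical toy model: query -> {doc_id: probability that the doc is retrieved}.
Model = Dict[str, Dict[str, float]]


def expected_gain(query: str, covered: Set[str], model: Model) -> float:
    """Conditional marginal gain: expected number of new documents from `query`."""
    return sum(p for doc, p in model[query].items() if doc not in covered)


def adaptive_greedy(model: Model, budget: int,
                    rng: random.Random) -> Tuple[List[str], Set[str]]:
    covered: Set[str] = set()
    asked: List[str] = []
    for _ in range(budget):
        remaining = [q for q in model if q not in asked]
        if not remaining:
            break
        # Greedy step: pick the query with the largest conditional marginal gain.
        best = max(remaining, key=lambda q: expected_gain(q, covered, model))
        asked.append(best)
        # Observe the random outcome and fold it into the covered set.
        covered |= {d for d, p in model[best].items() if rng.random() < p}
    return asked, covered


model: Model = {
    "q_a": {"d1": 1.0, "d2": 1.0},
    "q_b": {"d2": 1.0, "d3": 0.5},
    "q_c": {"d4": 0.9},
}
asked, covered = adaptive_greedy(model, budget=2, rng=random.Random(0))
print(asked)  # ['q_a', 'q_c']
```

After `q_a` covers `d1` and `d2`, the gain of `q_b` drops from 1.5 to 0.5, so the policy adapts and prefers `q_c`. In the attack setting the probabilities are unknown, which is why CMG has to be estimated indirectly.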
3. Knowledge Graph Construction and Attacker State
At each turn $t$, RAGCRAWLER maintains an attacker-local, evolving knowledge graph $G_t = (V_t, L_t, E_t)$:
- Entities ($V_t$): Canonicalized mentions (e.g., “Patient A”, “symptom”).
- Relations ($L_t$): Domain-specific types (e.g., “has_symptom”, “treated_by”).
- Edges ($E_t \subseteq V_t \times L_t \times V_t$): Observed entity-relation-entity triples.
After receiving the answer $a_t$, a two-pass LLM-based extraction-reflection phase yields a subgraph $\Delta G_t$, which is merged into the knowledge graph. Entities and relations are semantically merged using the cosine similarity of their embeddings to avoid surface-form redundancy, so that increments in $G_t$ correlate with genuine increases in corpus coverage. This structure enables the estimation of CMG by tracking growth in the knowledge graph rather than directly observing retrieved documents (Yao et al., 22 Jan 2026).
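The semantic merge step can be sketched as follows. This is a minimal illustration under stated assumptions: the two-dimensional vectors stand in for real sentence embeddings, the 0.9 threshold is arbitrary, and `canonicalize` and `add_triple` are hypothetical helper names rather than the paper's implementation.

```python
import math
from typing import Dict, Sequence, Set, Tuple

Vector = Sequence[float]
Triple = Tuple[str, str, str]


def cosine(u: Vector, v: Vector) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def canonicalize(mention: str, vec: Vector, canon: Dict[str, Vector],
                 threshold: float = 0.9) -> str:
    """Map `mention` to the most similar known entity, or register it as new."""
    best_name, best_sim = None, threshold
    for name, ref in canon.items():
        sim = cosine(vec, ref)
        if sim >= best_sim:
            best_name, best_sim = name, sim
    if best_name is None:
        canon[mention] = vec
        return mention
    return best_name


def add_triple(triple: Triple, edges: Set[Triple]) -> bool:
    """Insert a triple; return True only if it is genuinely new to the graph."""
    is_new = triple not in edges
    edges.add(triple)
    return is_new


canon: Dict[str, Vector] = {}
edges: Set[Triple] = set()
head = canonicalize("fever", [1.0, 0.0], canon)
alias = canonicalize("high temperature", [0.98, 0.2], canon)  # toy near-duplicate
print(head, alias)  # fever fever
print(add_triple(("Patient A", "has_symptom", head), edges))   # True
print(add_triple(("Patient A", "has_symptom", alias), edges))  # False
```

Because the paraphrased mention collapses onto the existing node, the second triple adds no graph growth, so it would not be counted as new coverage.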
4. CMG Estimation and Query Planning in Semantic Space
RAGCRAWLER’s query planning involves two main components:
- Entity Selection via UCB and Graph Priors: For each entity $v$ in the current knowledge graph, empirical payoffs are computed based on novel node/edge discovery; an upper confidence bound (UCB) encourages both exploitation and exploration, augmented by structural graph scores (DegreeScore, AdjScore) reflecting local connectivity and under-explored relations. Entities are sampled from a Top-$k$ softmax to preserve diversity.
- Relation Selection: Given the selected entity $v$, the scheduler picks the relation $r$ that maximizes the “deficit” score among unexplored relations, provided the potential information gain is significant.
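Entity selection can be sketched as a UCB score followed by Top-$k$ softmax sampling. The source does not give the exact scoring formula here, so the bonus term, the additive graph prior, and the constants `c`, `lam`, and `temperature` below are assumptions chosen for illustration; relation selection is not covered.

```python
import math
import random
from typing import Dict


def ucb_scores(payoff_sum: Dict[str, float], pulls: Dict[str, int],
               prior: Dict[str, float], c: float = 1.0,
               lam: float = 0.5) -> Dict[str, float]:
    """Mean payoff + exploration bonus + weighted structural prior (assumed form)."""
    total = sum(pulls.values())
    scores = {}
    for entity, n in pulls.items():
        mean = payoff_sum[entity] / n if n else 0.0
        bonus = c * math.sqrt(math.log(total + 1) / (n + 1))
        scores[entity] = mean + bonus + lam * prior.get(entity, 0.0)
    return scores


def sample_top_k(scores: Dict[str, float], k: int, temperature: float,
                 rng: random.Random) -> str:
    """Sample one entity from a softmax restricted to the k best scores."""
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    peak = max(scores[e] for e in top)  # subtract the max for numerical stability
    weights = [math.exp((scores[e] - peak) / temperature) for e in top]
    return rng.choices(top, weights=weights, k=1)[0]


pulls = {"fever": 5, "rash": 0, "aspirin": 5}
payoff_sum = {"fever": 5.0, "rash": 0.0, "aspirin": 0.5}
prior = {"rash": 1.0}  # e.g. a well-connected but unexplored node
scores = ucb_scores(payoff_sum, pulls, prior)
choice = sample_top_k(scores, k=2, temperature=0.5, rng=random.Random(0))
```

In this toy state the never-queried entity `rash` outranks the productive entity `fever` because of its exploration bonus and prior, while sampling from the softmax instead of taking the argmax keeps some probability on `fever`.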
Queries are generated from the selected entity-relation pair $(v, r)$ using LLM-driven templates, and history-aware deduplication prevents semantic redundancy by rejecting candidates that are too similar to previous queries. All embedding computations employ pre-trained sentence transformers.
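History-aware deduplication amounts to a similarity check against all earlier queries. The sketch below swaps in a bag-of-words embedding so that it runs without external models; the source uses pre-trained sentence transformers, and the `max_sim` threshold of 0.8 is an assumed value.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Stand-in embedding: bag-of-words counts (not a sentence transformer)."""
    return Counter(text.lower().split())


def cosine(u: Counter, v: Counter) -> float:
    dot = sum(u[w] * v[w] for w in u)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def is_novel(candidate: str, history: List[str], max_sim: float = 0.8) -> bool:
    """Accept a candidate only if it stays below `max_sim` for every past query."""
    c = embed(candidate)
    return all(cosine(c, embed(past)) < max_sim for past in history)


history = ["what are common symptoms of influenza"]
print(is_novel("what are common symptoms of influenza", history))  # False
print(is_novel("which drugs treat hypertension", history))         # True
```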
The sequential workflow is implemented efficiently: per-query runtime is dominated by at most three LLM calls, and caching and lazy updates after graph expansion keep the amortized per-query planning overhead small (Yao et al., 22 Jan 2026).
5. Experimental Results and Evaluation Metrics
RAGCRAWLER is evaluated on four benchmarks: TREC-COVID, SciDocs, NFCorpus, and Healthcare-Magic, with datasets of 1,000 sampled documents. Victim RAGs instantiate two bi-encoder retrievers (BGE, GTE) and four LLM generators (Llama-3-8B, Command-R-7B, Microsoft Phi-4, GPT-4o-mini with guardrails), including variants with query rewriting and multi-query retrieval.
The following table summarizes core results for coverage rate (CR) and semantic fidelity (SF), comparing RAGCRAWLER to RAGThief and IKEA:
| Dataset / Generator | Metric | RAGThief | IKEA | RAGCRAWLER |
|---|---|---|---|---|
| TREC-COVID / Llama-3-8B | CR | 0.131 | 0.161 | 0.494 |
| | SF | 0.447 | 0.495 | 0.591 |
| SciDocs / Command-R | CR | 0.093 | 0.522 | 0.717 |
| | SF | 0.295 | 0.514 | 0.561 |
| NFCorpus / Phi-4 | CR | 0.113 | 0.566 | 0.813 |
| | SF | 0.487 | 0.663 | 0.717 |
| Healthcare / GPT-4o-mini | CR | 0.000 | 0.693 | 0.799 |
| | SF | – | 0.490 | 0.577 |
On average, RAGCRAWLER achieves 66.8% corpus coverage (20.7 percentage points over the previously best-performing IKEA), with peak coverage of 84.4% on NFCorpus, and consistently leads in semantic fidelity (mean SF = 0.605). Surrogate RAGs built from RAGCRAWLER’s reconstructed knowledge also achieve superior answer similarity and ROUGE-L to baselines. Attack cost per dataset is only \$0.33–\$0.53 (Doubao API) or near-zero with open-source LLMs (Qwen) (Yao et al., 22 Jan 2026).
6. Robustness, Limitations, and Implications for Security
RAGCRAWLER maintains high effectiveness against advanced RAG architectures, including systems implementing query rewriting (coverage 74.1%) and multi-query retrieval (coverage 69.5%), outperforming all baselines in these scenarios. Notably, query rewriting can increase information exposure due to improved retrieval diversity.
Strengths:
- Principled $(1 - 1/e)$-approximate greedy optimization under ASCP.
- Global knowledge graph state enabling robust, non-redundant exploration.
- Strong experimental performance at low computational and monetary cost.
- High resilience to common defenses (query rewriting, multi-query retrieval).
Limitations:
- Dependence on reasonably accurate, domain-specific schema priors for knowledge graph construction; excessive domain drift may undermine graph consistency and coverage estimation.
- Stealth relies on subtle prompt engineering; future sequence-level safety filters could retrospectively detect unnatural coverage maximization behavior.
Static, query-level sanitization is inadequate to defend against RAGCRAWLER. Instead, dynamic, behavior-aware defenses—such as query-provenance analysis that tracks coverage-maximizing patterns across turns—are recommended (Yao et al., 22 Jan 2026).
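One possible shape for such a behavior-aware defense is a server-side monitor that tracks how much of the corpus each session has touched across turns. This is only a sketch of the idea: the class name `CoverageMonitor`, the per-session keying, and the 20% threshold are assumptions, and the source does not prescribe a specific mechanism.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


class CoverageMonitor:
    """Flags sessions whose cumulative retrieval footprint grows unusually large."""

    def __init__(self, corpus_size: int, max_fraction: float = 0.2) -> None:
        self.corpus_size = corpus_size
        self.max_fraction = max_fraction  # assumed policy threshold
        self.seen: Dict[str, Set[str]] = defaultdict(set)

    def record(self, session_id: str, retrieved_ids: Iterable[str]) -> bool:
        """Record one turn; return True if the session should be flagged."""
        self.seen[session_id].update(retrieved_ids)
        return len(self.seen[session_id]) / self.corpus_size > self.max_fraction


monitor = CoverageMonitor(corpus_size=10, max_fraction=0.2)
print(monitor.record("s1", ["d1", "d2"]))  # False
print(monitor.record("s1", ["d2", "d3"]))  # True
```

Unlike a per-query filter, the decision depends on state accumulated over the whole session, so individually innocuous queries can still trip the threshold once their combined footprint is large.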
7. Significance and Research Impact
RAGCRAWLER establishes a near-optimal black-box extraction threat to RAG architectures, quantifying a substantial, previously underappreciated privacy risk in retrieval-augmented AI deployments. The results highlight the need for rigorous sequence-level monitoring and more sophisticated, cross-turn security mechanisms in RAG systems. A plausible implication is that future RAG deployments with sensitive corpora should consider adaptive content exposure analysis and higher-order adversarial testing as standard practice to detect and mitigate stealthy data-exfiltration threats (Yao et al., 22 Jan 2026).