BudgetLeak: Membership Inference in RAG Systems
- BudgetLeak is a novel membership inference attack that leverages output-length side channels in RAG systems to differentiate member queries from non-members.
- It implements two variants—BudgetLeak-P using sequence modeling and BudgetLeak-Z using clustering—to analyze quality score evolution across varying generation budgets.
- Evaluations on diverse datasets show BudgetLeak-P achieving AUC up to 0.997 and TPR@0.1% FPR exceeding 40%, highlighting critical privacy vulnerabilities.
BudgetLeak refers to a novel membership inference attack on Retrieval-Augmented Generation (RAG) systems that exploits the generation budget—a side channel created by output-length constraints imposed on LLM generators. Through systematic probing of how answer quality evolves as the allowed output length increases, BudgetLeak distinguishes member queries (queries whose answers were present in the RAG knowledge base) from non-member queries. This approach exposes previously unaddressed privacy risks in RAG systems, particularly in settings where black-box constraints obscure other membership signals (Li et al., 15 Nov 2025).
1. Formalism and Side-Channel Structure
In the RAG setup, a query $q$ is submitted to a retriever $\mathcal{R}$, which returns the top-$k$ passages from a private knowledge base $\mathcal{D}$. A generator $\mathcal{G}$, typically an LLM, then produces an answer $a$. The system enforces a generation budget $B$, limiting the response to at most $B$ tokens; this constraint can be externally set or queried.
For each budget $B$ and query $q$, define a quality score $S_B(q) = \mathrm{sim}(a_B, a^*)$, where $a^*$ is the ground-truth answer and $\mathrm{sim}$ could be a metric such as cosine similarity of embeddings. Varying $B$ exposes a behavioral pattern: as $B$ increases, member queries rapidly achieve higher $S_B$ values, while non-member queries plateau early or improve slowly. Formally, for a budget increment $\Delta B$, define

$$\Delta_B S(q) = S_{B+\Delta B}(q) - S_B(q),$$

which typically satisfies $\mathbb{E}[\Delta_B S(q_m)] > \mathbb{E}[\Delta_B S(q_n)]$, where $q_m$ and $q_n$ denote member and non-member queries, respectively (Li et al., 15 Nov 2025).
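The gap above can be illustrated with a toy saturation model. All numbers here (the probe budgets and the per-token rates `alpha`) are illustrative assumptions, not values from the paper:

```python
# Toy model of the budget side channel: each generated token closes a
# fixed fraction alpha of the remaining gap to the best score s_max.
# Member queries (grounded by retrieval) have a much larger alpha.

def quality_score(budget, alpha, s_max=1.0):
    """Score after `budget` tokens under geometric saturation."""
    return s_max * (1.0 - (1.0 - alpha) ** budget)

budgets = [4, 8, 16, 32]  # hypothetical probe schedule

def mean_increment(alpha):
    """Mean score gain between consecutive probed budgets (Delta_B S)."""
    scores = [quality_score(b, alpha) for b in budgets]
    steps = [hi - lo for lo, hi in zip(scores, scores[1:])]
    return sum(steps) / len(steps)

gap_member = mean_increment(alpha=0.10)     # fast convergence
gap_nonmember = mean_increment(alpha=0.01)  # slow convergence
# Over this early-budget range, members gain more per budget step,
# which is exactly the expected-increment inequality above.
```
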
2. Attack Methodology
BudgetLeak leverages this side-channel via two key implementations:
- Sequence-Modeling Variant (BudgetLeak-P): For each query, a metric vector $v(q) = [S_{B_1}(q), \dots, S_{B_T}(q)]$ is collected by querying under multiple budgets $B_1 < \dots < B_T$. An attention-based LSTM or Transformer encodes $v(q)$ and outputs a membership probability, trained via cross-entropy against labeled shadow data.
- Clustering Variant (BudgetLeak-Z): Extracts statistics from $v(q)$ (mean slope, variance, cumulative fluctuation), then clusters the resulting feature vectors into member and non-member groups using $k$-means or fuzzy $c$-means. The cluster whose trajectories reach the higher final score is labeled "member."
These methodologies depend solely on the external interface (black-box access) and assume the ability to probe multiple generation budgets.
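A zero-knowledge attack in the spirit of BudgetLeak-Z can be sketched as follows. This is a simplification: 1-D 2-means over the mean slope stands in for the paper's $k$-means / fuzzy $c$-means over the full statistic vector, and all trajectories are made up:

```python
# Sketch of a BudgetLeak-Z-style clustering attack over quality-score
# trajectories. Feature choice and clustering are simplified assumptions.

def features(traj):
    """Summary statistics of a score trajectory across budgets."""
    steps = [hi - lo for lo, hi in zip(traj, traj[1:])]
    mean_slope = sum(steps) / len(steps)
    mean = sum(traj) / len(traj)
    variance = sum((s - mean) ** 2 for s in traj) / len(traj)
    fluctuation = sum(abs(d) for d in steps)  # cumulative fluctuation
    return mean_slope, variance, fluctuation

def two_means_1d(xs, iters=20):
    """Plain 1-D 2-means: returns a 0/1 cluster label per point."""
    c0, c1 = min(xs), max(xs)
    labels = [0] * len(xs)
    for _ in range(iters):
        labels = [0 if abs(x - c0) <= abs(x - c1) else 1 for x in xs]
        g0 = [x for x, lab in zip(xs, labels) if lab == 0]
        g1 = [x for x, lab in zip(xs, labels) if lab == 1]
        if g0: c0 = sum(g0) / len(g0)
        if g1: c1 = sum(g1) / len(g1)
    return labels

# Toy trajectories: members saturate fast, non-members creep up slowly.
member_trajs = [[0.30, 0.60, 0.80, 0.90], [0.35, 0.65, 0.85, 0.92]]
nonmember_trajs = [[0.05, 0.08, 0.12, 0.15], [0.04, 0.09, 0.11, 0.16]]
trajs = member_trajs + nonmember_trajs

slopes = [features(t)[0] for t in trajs]
labels = two_means_1d(slopes)
# The cluster whose trajectories reach the higher final score is "member".
finals = {lab: max(t[-1] for t, l in zip(trajs, labels) if l == lab)
          for lab in set(labels)}
member_label = max(finals, key=finals.get)
predictions = [lab == member_label for lab in labels]
```
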
3. Experimental Evaluation and Results
BudgetLeak was evaluated on four datasets (HealthCareMagic-100k, MS-MARCO, Natural Questions, and AGNews) using three generators (Meta-Llama-3-8B-Instruct, Mistral-7B-Instruct, GLM-4-9B-Chat) and two retrievers (all-MiniLM-L6-v2 and bge-small-en-v1.5).
Performance is measured via Area Under the Curve (AUC), accuracy, and TPR@FPR (the maximum true-positive rate achievable at a fixed false-positive-rate threshold). Results indicate:
- BudgetLeak-P: Consistently high AUC and accuracy across all settings (up to 0.997 and 0.982, respectively), outperforming baselines by 0.20–0.30 absolute AUC.
- TPR@0.1% FPR: Baselines achieve under 2%, while BudgetLeak-P exceeds 40%.
- BudgetLeak-Z: Superior to all baselines in zero-knowledge settings, reaching AUC up to 0.98.
- Efficiency: A full budget sweep issues one query per probed budget per sample, but a tri-budget version (min/mean/max budgets, i.e., 3 queries per sample) maintains comparable AUC (Li et al., 15 Nov 2025).
| Method | AUC | Accuracy | TPR@0.1% FPR |
|---|---|---|---|
| S²MIA-T | 0.542 | 0.518 | 0.0% |
| DC-MIA | 0.587 | 0.587 | 0.0% |
| MBA | 0.730 | 0.735 | 1.8% |
| IA | 0.714 | 0.700 | 1.5% |
| BudgetLeak-P | 0.997 | 0.982 | 42.9% |
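The TPR@FPR metric reported in the table can be computed directly from per-query attack scores. A minimal sketch (the scoring function itself is whatever the attack outputs; this helper is not from the paper):

```python
def tpr_at_fpr(scores, labels, max_fpr=0.001):
    """Maximum true-positive rate over all thresholds whose
    false-positive rate stays at or below max_fpr.

    scores: higher = more member-like; labels: True for members."""
    pos = sum(1 for l in labels if l)
    neg = len(labels) - pos
    best = 0.0
    # Sweep decision thresholds from strictest to loosest.
    for t in sorted(set(scores), reverse=True):
        preds = [s >= t for s in scores]
        fp = sum(1 for p, l in zip(preds, labels) if p and not l)
        tp = sum(1 for p, l in zip(preds, labels) if p and l)
        if neg == 0 or fp / neg <= max_fpr:
            best = max(best, tp / pos)
    return best
```

Note that at a cap as strict as 0.1% FPR, any evaluation set with fewer than 1,000 non-members effectively requires zero false positives, which is why weak baselines report 0.0% in the table.
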
4. Analysis of the Budget Side Channel
The theoretical distinction between member and non-member queries is formalized as follows. For members, the retriever returns highly relevant passages with probability $p$, and each additional token improves answer quality by a fixed fraction $\alpha_m$ of the remaining gap, so $S_B(q_m)$ converges geometrically toward its maximum. For non-members, the generator's per-token improvement rate $\alpha_n$ satisfies $\alpha_n \ll \alpha_m$, so $S_B(q_n)$ rises slowly and remains well below the member curve over practical budgets. Thus, the convergence-rate gap produces a robust, quantifiable signal exploitable by BudgetLeak under both idealized and practical settings (Li et al., 15 Nov 2025).
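Under a saturation reading of this argument, the score curves and per-step increments can be written out explicitly. This is a reconstruction with illustrative symbols ($S_{\max}$, $\alpha_m$, $\alpha_n$); the paper's exact formulation may differ:

```latex
S_B(q_m) \approx S_{\max}\bigl(1-(1-\alpha_m)^{B}\bigr), \qquad
S_B(q_n) \approx S_{\max}\bigl(1-(1-\alpha_n)^{B}\bigr), \qquad
\alpha_n \ll \alpha_m,

\Delta_B S(q) = S_{B+\Delta B}(q) - S_B(q)
             = S_{\max}\,(1-\alpha)^{B}\bigl(1-(1-\alpha)^{\Delta B}\bigr),
```

so at small and moderate budgets the member increment dominates, matching the expected-increment inequality used by the attack.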
5. Limitations and Defenses
Limitations
- Assumes control over or observability of the generation budget parameter.
- Effectiveness is reduced by non-deterministic or randomized retrievers.
- Not applicable to Graph-RAG or multimodal RAG settings where output length is not strongly tied to answer informativeness.
Defenses
Several defenses are proposed, each with associated trade-offs:
- Budget Randomization: Randomizing generation budget introduces noise, disrupting alignment and substantially reducing signal strength.
- Fixed-Length Padding: Always pad or truncate outputs to a fixed length, collapsing the side channel.
- Output Sanitization: Heavily paraphrased postprocessing reduces detectable signals but may degrade utility.
However, randomization and padding can negatively impact user experience (e.g., forced truncation, increased latency), and sanitization may harm answer fidelity (Li et al., 15 Nov 2025).
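The first two defenses can be sketched as thin wrappers around the serving path. Function and parameter names here are assumptions for illustration, not an API from the paper:

```python
import random

def randomized_budget(requested_budget, jitter=0.2, rng=random):
    """Budget randomization: serve the request under a budget perturbed
    by up to +/- jitter, adding noise to the side channel."""
    low = int(requested_budget * (1 - jitter))
    high = int(requested_budget * (1 + jitter))
    return rng.randint(max(1, low), max(1, high))

def pad_or_truncate(tokens, fixed_len, pad_token="<pad>"):
    """Fixed-length padding: force every response to exactly fixed_len
    tokens so observable output length carries no membership signal."""
    clipped = tokens[:fixed_len]
    return clipped + [pad_token] * (fixed_len - len(clipped))
```

Both wrappers trade utility for privacy: randomization can truncate useful content, and padding inflates latency and bandwidth for short answers, mirroring the trade-offs noted above.
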
6. BudgetLeak in the Context of Privacy Budget Management
While BudgetLeak specifically refers to membership inference via the generation budget side channel in RAG systems, the terminology also interfaces with the broader challenge of "budget leak" in differential privacy. The Budget Recycling Differential Privacy (BR-DP) framework addresses related privacy leakage by splitting the total budget between a "DP kernel" and a "recycler," recycling budgeted queries that yield out-of-bound noisy results and tightening composition guarantees (Jiang et al., 18 Mar 2024). In both contexts, "budget leak" denotes a pathway—either side-channel in RAG or excess information in DP composition—by which confidential information may be inadvertently exposed.
7. Significance and Broader Impact
BudgetLeak exposes a fundamental vulnerability in RAG deployments, especially those relying on private or proprietary corpora. The attack demonstrates that even innocuous system behaviors, such as output length limitation, can act as subtle yet potent privacy side channels. This underscores the necessity for comprehensive privacy auditing in the design and deployment of next-generation RAG and LLM-driven systems and sharpens the focus on the interplay between utility, privacy, and system affordances. Recommended defenses entail managing trade-offs between privacy and usability, reinforcing the imperative for robust system design (Li et al., 15 Nov 2025).