
Fixed-Budget Evidence Assembly

Updated 16 December 2025
  • Fixed-budget evidence assembly is a framework that enforces strict resource constraints to maximize utility in multi-hop retrieval and causal inference.
  • It employs algorithmic strategies such as SEAL-RAG and swap rounding to maintain a fixed evidence set and reduce estimator variance.
  • Empirical benchmarks demonstrate significant gains in evidence precision and experimental efficiency under hard budget constraints.

Fixed-budget evidence assembly encompasses a set of algorithmic and experimental design principles that enforce strict resource constraints—principally on the number or cost of evidence units—while optimizing for estimation precision or downstream generation quality. In multi-hop retrieval-augmented generation (RAG), it serves as a countermeasure to context dilution, ensuring that only the highest-utility evidence is admitted into a fixed-width context window. In causal experimental design, it enables exact control over treatment budgets while reducing estimator variance relative to independent assignment. This entry synthesizes the theoretical grounding, algorithmic methodologies, and empirical findings across leading frameworks for fixed-budget assembly, spanning both automated evidence selection in retrieval tasks and budget-constrained experimental assignment.

1. Formalization of Fixed-Budget Evidence Assembly

Fixed-budget evidence assembly is characterized by the following core constraints:

  • The evidence set $E$ must satisfy a cardinality or cost constraint, $|E| = k$ or $\sum_{e \in E} \text{cost}(e) = B$.
  • The assembly process must maximize the utility of $E$ with respect to downstream objectives—typically, the probability of answering a query $q$ (in RAG) or the precision of treatment effect estimation (in experiments).

In RAG, the optimization is formalized as

$$E^* = \arg\max_{E \subset \mathcal{C},\,|E|=k} P(a \mid q, E),$$

where $a$ is the generated answer, $q$ is the input query, $\mathcal{C}$ is the corpus, and $k$ is the evidence budget (Lahmy et al., 11 Dec 2025).

In experimental design, the assignment vector $A \in \{0,1\}^n$ must obey

$$\sum_{i} A_i = B, \qquad \mathbb{E}[A_i] = p_i,$$

for pre-specified treatment probabilities $p_i$ summing to $B$ (Yamin et al., 15 Jun 2025).

A fixed-budget regime is motivated by the need for predictable cost, reduced variance, and avoidance of resource-wasting context expansion or over-assignment.
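The subset maximization above is combinatorial, so exact search is rarely feasible. As a hypothetical stand-in (not part of either cited framework), a greedy loop that admits one evidence unit at a time until the budget $k$ is exhausted illustrates the constraint; `marginal_utility` here is an assumed scoring oracle:

```python
def greedy_budget_select(corpus, marginal_utility, k):
    """Greedy stand-in for the intractable argmax over |E| = k subsets:
    repeatedly admit the candidate with the highest marginal utility
    until exactly k units are held."""
    E, pool = [], list(corpus)
    for _ in range(k):
        best = max(pool, key=lambda c: marginal_utility(E, c))
        E.append(best)
        pool.remove(best)
    return E

# Toy run: the utility ignores context E, so the top-2 scores win.
picked = greedy_budget_select(
    ["p1", "p2", "p3"],
    lambda E, c: {"p1": 0.2, "p2": 0.9, "p3": 0.5}[c],
    k=2,
)
```

Greedy selection is a heuristic; it matches the exact optimum only under conditions such as submodular utility, but it respects $|E| = k$ by construction.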

2. Algorithms and Methodologies

2.1 SEAL-RAG for Multi-Hop Retrieval

The SEAL-RAG controller implements a fixed-budget, iterative repair loop (“replace, don't expand”) to actively maintain a top-$k$ evidence set free from context dilution. At each iteration:

  1. Search: Retrieve $k$ passages for $q$.
  2. Extract: Apply Open-IE-style entity and relation extraction to form an Entity Ledger $U_t$.
  3. Assess: Apply a “sufficiency gate” based on whether $U_t$ covers all entities/relations required for $q$.
  4. Gap Specification: Compute $G_t = \mathcal{N}(q) \setminus U_t$.
  5. Loop: For each gap, issue targeted micro-queries, and, using an entity-first utility function,

$$S(c \mid U_t) = \lambda_1 \text{GapCov} + \lambda_2 \text{Corr} + \lambda_3 \text{Nov} - \lambda_4 \text{Red},$$

propose replacements only if $S(c^*) > S(v) + \epsilon$, for the lowest-utility $v \in E_t$ and candidate $c^* \in C_t$.

All replacement steps maintain $|E| = k$ throughout. This avoids context dilution, unlike prior methods that accumulate or prune lists without explicit fixed-budget enforcement (Lahmy et al., 11 Dec 2025).
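The replacement rule above can be sketched in a few lines. This is an illustrative reduction of one repair step, not the published implementation: `score` stands in for the entity-first utility $S(c \mid U_t)$, and `eps` plays the role of the margin $\epsilon$.

```python
def repair_step(E, candidates, score, eps=0.05):
    """One "replace, don't expand" step: swap the best gap candidate
    for the weakest held passage only if it clears the margin eps.
    len(E) is unchanged either way, so the budget k holds by construction."""
    v = min(E, key=score)                # lowest-utility held passage
    c_star = max(candidates, key=score)  # best candidate from micro-queries
    if score(c_star) > score(v) + eps:
        E = [c_star if e == v else e for e in E]
    return E

# Toy run with made-up utility scores.
scores = {"a": 0.2, "b": 0.6, "c": 0.9}
E_next = repair_step(["a", "b"], ["c"], scores.get)
```

The key invariant is that the set size never grows: a candidate can only enter by displacing the current minimum-utility member.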

2.2 Dependent Randomized Rounding (Swap Rounding)

In fixed-budget experimental design, dependent randomized rounding (specifically, swap rounding) converts fractional assignment vectors into binary assignments while preserving both marginal probabilities and the budget. The swap rounding algorithm proceeds as follows:

  • Iteratively select pairs of non-integral entries $(i, j)$ in $p^{(t)}$, and perform a probabilistic “swap” operation that makes one or both values integral, updating $p^{(t+1)}$.
  • Feasibility ($\sum_{i} p^{(t)}_i = B$) and marginal preservation ($\mathbb{E}[A_i] = p_i$) are maintained at every step.
  • Negative correlations are induced: for swapped pairs, $\operatorname{Cov}(A_i, A_j) < 0$, which reduces the variance of any linear estimator.

The method operates in $O(n)$ time and space and is guaranteed to yield an assignment satisfying both hard budget and marginal constraints (Yamin et al., 15 Jun 2025).
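A minimal sketch of the pairwise swap step, assuming an integer total budget $B = \sum_i p_i$; the branch probabilities are chosen so that each entry's expectation is unchanged by the swap:

```python
import random

def swap_round(p, rng=None, tol=1e-9):
    """Swap rounding sketch: repeatedly pick two fractional entries and
    move probability mass between them so at least one becomes 0 or 1.
    Requires sum(p) to be an integer B; each swap preserves the sum,
    so the final 0/1 vector has exactly B ones."""
    rng = rng or random.Random(0)
    p = list(p)
    while True:
        frac = [i for i, v in enumerate(p) if tol < v < 1 - tol]
        if len(frac) < 2:
            break
        i, j = frac[0], frac[1]
        up_i = min(1 - p[i], p[j])  # raise p_i / lower p_j by up_i
        up_j = min(p[i], 1 - p[j])  # lower p_i / raise p_j by up_j
        # P(first branch) = up_j / (up_i + up_j), so
        # E[change in p_i] = up_j*up_i/(up_i+up_j) - up_i*up_j/(up_i+up_j) = 0.
        if rng.random() < up_j / (up_i + up_j):
            p[i], p[j] = p[i] + up_i, p[j] - up_i
        else:
            p[i], p[j] = p[i] - up_j, p[j] + up_j
    return [round(v) for v in p]

A = swap_round([0.5] * 6)  # budget B = 3
```

Each swap makes at least one entry integral (one of the two hits 0 or 1), which is what bounds the procedure at $O(n)$ swaps.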

3. Mathematical Foundations and Theoretical Guarantees

Both RAG and experimental design instantiations of fixed-budget assembly provide rigorous statistical or combinatorial guarantees:

  • Swap Rounding (Causal Inference):
    • For any linear estimator, swap rounding yields strictly smaller variance than independent Bernoulli assignment due to negative pairwise covariances.
    • The fixed-budget IPW estimator attains design-based unbiasedness:

    $$\mathbb{E}[\hat{\tau}_{\text{swap}}] = \tau_{\text{SATE}},$$

    and the variance decomposition involves explicit covariance terms given by

    $$\operatorname{Cov}(A_i, A_j) = \begin{cases} -p_i p_j & \text{if } p_i + p_j \leq 1, \\ -(1-p_i)(1-p_j) & \text{if } p_i + p_j > 1. \end{cases}$$

    • Asymptotically, normal confidence intervals with fixed-budget variance estimators retain nominal coverage under mild conditions (Yamin et al., 15 Jun 2025).

  • SEAL-RAG (Retrieval):

    • The sufficiency gate fuses zero-shot LLM signals (Coverage, Corroboration, Contradiction, Answerability) into a stopping criterion, ensuring that only truly gap-closing evidence is admitted.
    • The complexity is $O(L \times \text{Retriever}) + O(L \times \text{Extractor}) + O(\text{Generator}_k)$, with $L$ as the number of repair loops, holding the generation cost at $k$ (Lahmy et al., 11 Dec 2025).

4. Empirical Findings and Benchmark Results

4.1 Multi-Hop RAG Evidence Assembly

Comprehensive experiments on HotpotQA and 2WikiMultiHopQA validate fixed-budget assembly:

| Setting | Baseline | Baseline Prec@$k$ / Acc (%) | SEAL-RAG Prec@$k$ / Acc (%) | Gain (pp) |
|---|---|---|---|---|
| HotpotQA, $k=3$ | Self-RAG | 76 (Prec@3) / 71 (EM) | 89 / 77 | +13 / +6 |
| 2Wiki, $k=5$ | Adaptive-$k$ | 26 (Prec@5) / 66.5 (Acc) | 96 / 74.5 | +70 / +8 |

All gains are statistically significant ($p<0.001$). SEAL-RAG's replacement loop consistently boosts both answer correctness (+3–19 pp on HotpotQA, +8–40 pp on 2Wiki) and evidence precision (+12–70 pp), demonstrating superior optimization under a hard budget (Lahmy et al., 11 Dec 2025).

4.2 Fixed-Budget Experimental Design

In synthetic and semi-synthetic settings (e.g., the IHDP infant health RCT, a public housing intervention), swap rounding delivers robust reductions in estimator variance, often outperforming all unbiased baselines:

  • Covariate-ordered swap rounding IPW achieves the lowest variance (10–50% reduction over alternatives) at moderate sample sizes.
  • Swap rounding consistently dominates re-randomization, budget-limited Bernoulli, and self-normalized IPW in variance, while remaining unbiased.
  • Specializations such as covariate-ordered swaps or blockwise application scale to large $n$ while maintaining the budget-exact constraint (Yamin et al., 15 Jun 2025).

5. Practical Implementation and Recommendations

  • RAG (SEAL-RAG): Entity-anchored extraction is crucial. Micro-queries should be atomic and gap-specific to minimize redundancy and distractor risk. Blocklisted unproductive queries prevent wasted retrieval cycles. Gap prioritization and utility scoring—favoring gap coverage, corroboration, novelty, and penalizing redundancy—are essential for precision. Maintaining a fixed kk throughout yields predictable costs.
  • Experimental Design: Pre-optimize $p_i$ (e.g., by Neyman allocation or logistic regression), ensure $\sum p_i = B$ (via scaling or projection), and clip $p_i$ to $[\epsilon, 1-\epsilon]$ with $\epsilon \approx 0.01$ to avoid extreme weights. For efficiency, use greedy or TSP-style orderings for covariate-ordered swap rounding, and blockwise strategies for massive $n$.
  • Both domains benefit from the simplicity and scalability of dependent rounding procedures and iterative, targeted repair under a hard evidence budget.
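The scale-and-clip recommendation for $p_i$ can be sketched as a simple box projection. `project_to_budget` is a hypothetical helper, not from the cited paper: it clips to $[\epsilon, 1-\epsilon]$, then repeatedly shifts the un-clipped entries by a uniform amount until the vector sums to $B$.

```python
def project_to_budget(raw, B, eps=0.01, iters=100):
    """Clip propensities to [eps, 1-eps], then re-balance the entries
    that still have slack until sum(p) == B (within tolerance)."""
    p = [min(max(v, eps), 1 - eps) for v in raw]
    for _ in range(iters):
        gap = B - sum(p)
        if abs(gap) < 1e-10:
            break
        # Entries that can still move in the direction of the gap.
        free = [i for i, v in enumerate(p)
                if (v < 1 - eps if gap > 0 else v > eps)]
        step = gap / len(free)
        for i in free:
            p[i] = min(max(p[i] + step, eps), 1 - eps)
    return p

# Raw propensities sum to 2.5; project down to a budget of B = 2.
p = project_to_budget([0.8, 0.7, 0.6, 0.4], B=2)
```

The resulting vector is a valid input for swap rounding: it sums exactly to the budget and avoids the extreme weights that destabilize IPW.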

6. Limitations and Open Questions

Unresolved challenges and active research areas include:

  • Extension to multi-arm or continuous dose treatments (experimental), or to variable-length context windows with strict upper limits (RAG).
  • Cluster- or block-randomized design under fixed-budget constraints, and scalable dependent rounding for extremely large $n$ (Yamin et al., 15 Jun 2025).
  • Integration of covariate-adaptive or sequential evidence assembly procedures.
  • A plausible implication is that the general utility of negative dependence and active replacement under a fixed budget extends to other domains where resource-constrained selection must avoid redundancy and context dilution.

7. Broader Context and Significance

Fixed-budget evidence assembly is a unifying paradigm for resource-efficient optimization in both information retrieval and experimental design. In multi-hop retrieval-augmented generation, fixed-budget controllers such as SEAL-RAG eliminate context dilution and sharply increase reasoning precision, with quantifiable accuracy and evidence gains on challenging multi-hop datasets (Lahmy et al., 11 Dec 2025). In causal inference, swap rounding sets a new standard for unbiased, variance-minimizing assignment under hard budget constraints, with both theoretical guarantees and empirical superiority over traditional approaches (Yamin et al., 15 Jun 2025). The strict budget constraint is not merely a logistical necessity, but an integral algorithmic device for enhancing the quality and interpretability of inference in complex systems.
