
Fixed-Budget Evidence Assembly

Updated 16 December 2025
  • Fixed-budget evidence assembly is a framework that enforces strict resource constraints to maximize utility in multi-hop retrieval and causal inference.
  • It employs algorithmic strategies such as SEAL-RAG and swap rounding to maintain a fixed evidence set and reduce estimator variance.
  • Empirical benchmarks demonstrate significant gains in evidence precision and experimental efficiency under hard budget constraints.

Fixed-budget evidence assembly encompasses a set of algorithmic and experimental design principles that enforce strict resource constraints—principally on the number or cost of evidence units—while optimizing for estimation precision or downstream generation quality. In multi-hop retrieval-augmented generation (RAG), it serves as a countermeasure to context dilution, ensuring that only the highest-utility evidence is admitted into a fixed-width context window. In causal experimental design, it enables exact control over treatment budgets while reducing estimator variance relative to independent assignment. This entry synthesizes the theoretical grounding, algorithmic methodologies, and empirical findings across leading frameworks for fixed-budget assembly, spanning both automated evidence selection in retrieval tasks and budget-constrained experimental assignment.

1. Formalization of Fixed-Budget Evidence Assembly

Fixed-budget evidence assembly is characterized by the following core constraints:

  • The evidence set $E$ must satisfy a cardinality or cost constraint, $|E| = k$ or $\sum_{e \in E} \text{cost}(e) = B$.
  • The assembly process must maximize the utility of $E$ with respect to downstream objectives—typically, the probability of answering a query $q$ (in RAG) or the precision of treatment effect estimation (in experiments).

In RAG, the optimization is formalized as

$$E^* = \arg\max_{E \subset \mathcal{C},\,|E|=k} P(a \mid q, E),$$

where $a$ is the generated answer, $q$ is the input query, $\mathcal{C}$ is the corpus, and $k$ is the evidence budget (Lahmy et al., 11 Dec 2025).

In experimental design, the assignment vector $A \in \{0,1\}^n$ must obey

$$\sum_{i} A_i = B, \qquad \mathbb{E}[A_i] = p_i,$$

for pre-specified treatment probabilities $p_i$ summing to $B$ (Yamin et al., 15 Jun 2025).

A fixed-budget regime is motivated by the need for predictable cost, reduced variance, and avoidance of resource-wasting context expansion or over-assignment.
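The subset maximization above is combinatorial, so exact search is rarely feasible. As a hypothetical stand-in (not part of either cited framework), a greedy loop that admits one evidence unit at a time until the budget $k$ is exhausted illustrates the constraint; `marginal_utility` here is an assumed scoring oracle:

```python
def greedy_budget_select(corpus, marginal_utility, k):
    """Greedy stand-in for the intractable argmax over |E| = k subsets:
    repeatedly admit the candidate with the highest marginal utility
    until exactly k units are held."""
    E, pool = [], list(corpus)
    for _ in range(k):
        best = max(pool, key=lambda c: marginal_utility(E, c))
        E.append(best)
        pool.remove(best)
    return E

# Toy run: the utility ignores context E, so the top-2 scores win.
picked = greedy_budget_select(
    ["p1", "p2", "p3"],
    lambda E, c: {"p1": 0.2, "p2": 0.9, "p3": 0.5}[c],
    k=2,
)
```

Greedy selection is a heuristic; it matches the exact optimum only under conditions such as submodular utility, but it respects $|E| = k$ by construction.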

2. Algorithms and Methodologies

2.1 SEAL-RAG for Multi-Hop Retrieval

The SEAL-RAG controller implements a fixed-budget, iterative repair loop (“replace, don't expand”) to actively maintain a top-$k$ evidence set free from context dilution. At each iteration:

  1. Search: Retrieve $k$ passages for $q$.
  2. Extract: Apply Open-IE-style entity and relation extraction to form an Entity Ledger $U_t$.
  3. Assess: Apply a “sufficiency gate” based on whether $U_t$ covers all entities/relations required for $q$.
  4. Gap Specification: Compute $G_t = \mathcal{N}(q) \setminus U_t$.
  5. Loop: For each gap, issue targeted micro-queries, and, using an entity-first utility function,

$$S(c \mid U_t) = \lambda_1 \text{GapCov} + \lambda_2 \text{Corr} + \lambda_3 \text{Nov} - \lambda_4 \text{Red},$$

propose replacements only if $S(c^*) > S(v) + \epsilon$, for the lowest-utility $v \in E_t$ and candidate $c^* \in C_t$.

All replacement steps maintain $|E| = k$ throughout. This avoids context dilution, unlike prior methods that accumulate or prune lists without explicit fixed-budget enforcement (Lahmy et al., 11 Dec 2025).
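The replacement rule above can be sketched in a few lines. This is an illustrative reduction of one repair step, not the published implementation: `score` stands in for the entity-first utility $S(c \mid U_t)$, and `eps` plays the role of the margin $\epsilon$.

```python
def repair_step(E, candidates, score, eps=0.05):
    """One "replace, don't expand" step: swap the best gap candidate
    for the weakest held passage only if it clears the margin eps.
    len(E) is unchanged either way, so the budget k holds by construction."""
    v = min(E, key=score)                # lowest-utility held passage
    c_star = max(candidates, key=score)  # best candidate from micro-queries
    if score(c_star) > score(v) + eps:
        E = [c_star if e == v else e for e in E]
    return E

# Toy run with made-up utility scores.
scores = {"a": 0.2, "b": 0.6, "c": 0.9}
E_next = repair_step(["a", "b"], ["c"], scores.get)
```

The key invariant is that the set size never grows: a candidate can only enter by displacing the current minimum-utility member.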

2.2 Dependent Randomized Rounding (Swap Rounding)

In fixed-budget experimental design, dependent randomized rounding (specifically, swap rounding) converts fractional assignment vectors into binary assignments while preserving both marginal probabilities and the budget. The swap rounding algorithm proceeds as follows:

  • Iteratively select pairs of non-integral entries $(i, j)$ in $p^{(t)}$, and perform a probabilistic “swap” operation that makes one or both values integral, updating $p^{(t+1)}$.
  • Feasibility ($\sum_{i} p^{(t)}_i = B$) and marginal preservation ($\mathbb{E}[A_i] = p_i$) are maintained at every step.
  • Negative correlations are induced: for swapped pairs, $\operatorname{Cov}(A_i, A_j) < 0$, which reduces the variance of any linear estimator.

The method operates in $O(n)$ time and space and is guaranteed to yield an assignment satisfying both hard budget and marginal constraints (Yamin et al., 15 Jun 2025).
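A minimal sketch of the pairwise swap step, assuming an integer total budget $B = \sum_i p_i$; the branch probabilities are chosen so that each entry's expectation is unchanged by the swap:

```python
import random

def swap_round(p, rng=None, tol=1e-9):
    """Swap rounding sketch: repeatedly pick two fractional entries and
    move probability mass between them so at least one becomes 0 or 1.
    Requires sum(p) to be an integer B; each swap preserves the sum,
    so the final 0/1 vector has exactly B ones."""
    rng = rng or random.Random(0)
    p = list(p)
    while True:
        frac = [i for i, v in enumerate(p) if tol < v < 1 - tol]
        if len(frac) < 2:
            break
        i, j = frac[0], frac[1]
        up_i = min(1 - p[i], p[j])  # raise p_i / lower p_j by up_i
        up_j = min(p[i], 1 - p[j])  # lower p_i / raise p_j by up_j
        # P(first branch) = up_j / (up_i + up_j), so
        # E[change in p_i] = up_j*up_i/(up_i+up_j) - up_i*up_j/(up_i+up_j) = 0.
        if rng.random() < up_j / (up_i + up_j):
            p[i], p[j] = p[i] + up_i, p[j] - up_i
        else:
            p[i], p[j] = p[i] - up_j, p[j] + up_j
    return [round(v) for v in p]

A = swap_round([0.5] * 6)  # budget B = 3
```

Each swap makes at least one entry integral (one of the two hits 0 or 1), which is what bounds the procedure at $O(n)$ swaps.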

3. Mathematical Foundations and Theoretical Guarantees

Both RAG and experimental design instantiations of fixed-budget assembly provide rigorous statistical or combinatorial guarantees:

  • Swap Rounding (Causal Inference):
    • For any linear estimator, swap rounding yields strictly smaller variance than independent Bernoulli assignment due to negative pairwise covariances.
    • The fixed-budget IPW estimator attains design-based unbiasedness:

    $$\mathbb{E}[\hat{\tau}_{\text{swap}}] = \tau_{\text{SATE}},$$

    and the variance decomposition involves explicit covariance terms given by

    $$\operatorname{Cov}(A_i, A_j) = \begin{cases} -p_i p_j & \text{if } p_i + p_j \leq 1, \\ -(1-p_i)(1-p_j) & \text{if } p_i + p_j > 1. \end{cases}$$

    • Asymptotically, normal confidence intervals with fixed-budget variance estimators retain nominal coverage under mild conditions (Yamin et al., 15 Jun 2025).

  • SEAL-RAG (Retrieval):

    • The sufficiency gate fuses zero-shot LLM signals (Coverage, Corroboration, Contradiction, Answerability) into a stopping criterion, ensuring that only truly gap-closing evidence is admitted.
    • The complexity is $O(L \times \text{Retriever}) + O(L \times \text{Extractor}) + O(\text{Generator}_k)$, with $L$ as the number of repair loops, holding the generation cost at $k$ (Lahmy et al., 11 Dec 2025).

4. Empirical Findings and Benchmark Results

4.1 Multi-Hop RAG Evidence Assembly

Comprehensive experiments on HotpotQA and 2WikiMultiHopQA validate fixed-budget assembly:

| Setting | Baseline | Baseline Prec@$k$ / Acc (%) | SEAL-RAG Prec@$k$ / Acc (%) | Gain (pp) |
|---|---|---|---|---|
| HotpotQA, $k=3$ | Self-RAG | 76 (Prec@3) / 71 (EM) | 89 / 77 | +13 / +6 |
| 2Wiki, $k=5$ | Adaptive-$k$ | 26 (Prec@5) / 66.5 (Acc) | 96 / 74.5 | +70 / +8 |

All gains are statistically significant ($p<0.001$). SEAL-RAG's replacement loop consistently boosts both answer correctness (+3–19 pp on HotpotQA, +8–40 pp on 2Wiki) and evidence precision (+12–70 pp), demonstrating superior optimization under a hard budget (Lahmy et al., 11 Dec 2025).

4.2 Fixed-Budget Experimental Design

In synthetic and semi-synthetic settings (e.g., the IHDP infant health RCT, a public housing intervention), swap rounding delivers robust reductions in estimator variance, often outperforming all unbiased baselines:

  • Covariate-ordered swap rounding IPW achieves the lowest variance (10–50% reduction over alternatives) at moderate sample sizes.
  • Swap rounding consistently dominates re-randomization, budget-limited Bernoulli, and self-normalized IPW in variance, while remaining unbiased.
  • Specializations such as covariate-ordered swaps or blockwise application scale to large $n$ while maintaining the budget-exact constraint (Yamin et al., 15 Jun 2025).

5. Practical Implementation and Recommendations

  • RAG (SEAL-RAG): Entity-anchored extraction is crucial. Micro-queries should be atomic and gap-specific to minimize redundancy and distractor risk. Blocklisted unproductive queries prevent wasted retrieval cycles. Gap prioritization and utility scoring—favoring gap coverage, corroboration, novelty, and penalizing redundancy—are essential for precision. Maintaining a fixed kk throughout yields predictable costs.
  • Experimental Design: Pre-optimize $p_i$ (e.g., by Neyman allocation or logistic regression), ensure $\sum p_i = B$ (via scaling or projection), and clip $p_i$ to $[\epsilon, 1-\epsilon]$ with $\epsilon \approx 0.01$ to avoid extreme weights. For efficiency, use greedy or TSP-style orderings for covariate-ordered swap rounding, and blockwise strategies for massive $n$.
  • Both domains benefit from the simplicity and scalability of dependent rounding procedures and iterative, targeted repair under a hard evidence budget.
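The scale-and-clip recommendation for $p_i$ can be sketched as a simple box projection. `project_to_budget` is a hypothetical helper, not from the cited paper: it clips to $[\epsilon, 1-\epsilon]$, then repeatedly shifts the un-clipped entries by a uniform amount until the vector sums to $B$.

```python
def project_to_budget(raw, B, eps=0.01, iters=100):
    """Clip propensities to [eps, 1-eps], then re-balance the entries
    that still have slack until sum(p) == B (within tolerance)."""
    p = [min(max(v, eps), 1 - eps) for v in raw]
    for _ in range(iters):
        gap = B - sum(p)
        if abs(gap) < 1e-10:
            break
        # Entries that can still move in the direction of the gap.
        free = [i for i, v in enumerate(p)
                if (v < 1 - eps if gap > 0 else v > eps)]
        step = gap / len(free)
        for i in free:
            p[i] = min(max(p[i] + step, eps), 1 - eps)
    return p

# Raw propensities sum to 2.5; project down to a budget of B = 2.
p = project_to_budget([0.8, 0.7, 0.6, 0.4], B=2)
```

The resulting vector is a valid input for swap rounding: it sums exactly to the budget and avoids the extreme weights that destabilize IPW.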

6. Limitations and Open Questions

Unresolved challenges and active research areas include:

  • Extension to multi-arm or continuous dose treatments (experimental), or to variable-length context windows with strict upper limits (RAG).
  • Cluster- or block-randomized design under fixed-budget constraints, and scalable dependent rounding for extremely large $n$ (Yamin et al., 15 Jun 2025).
  • Integration of covariate-adaptive or sequential evidence assembly procedures.
  • A plausible implication is that the general utility of negative dependence and active replacement under a fixed budget extends to other domains where resource-constrained selection must avoid redundancy and context dilution.

7. Broader Context and Significance

Fixed-budget evidence assembly is a unifying paradigm for resource-efficient optimization in both information retrieval and experimental design. In multi-hop retrieval-augmented generation, fixed-budget controllers such as SEAL-RAG eliminate context dilution and sharply increase reasoning precision, with quantifiable accuracy and evidence gains on challenging multi-hop datasets (Lahmy et al., 11 Dec 2025). In causal inference, swap rounding sets a new standard for unbiased, variance-minimizing assignment under hard budget constraints, with both theoretical guarantees and empirical superiority over traditional approaches (Yamin et al., 15 Jun 2025). The strict budget constraint is not merely a logistical necessity, but an integral algorithmic device for enhancing the quality and interpretability of inference in complex systems.
