Choosing the Next RAG Configuration to Evaluate

Develop principled methods for selecting the next RAG algorithm configuration to evaluate during the \plan exploration process, which searches for the quality–performance Pareto frontier. These methods should account for heterogeneous adjustment costs: low-cost changes (model swaps, Top-K adjustments) versus high-cost database rebuilds (index type changes, chunking modifications, or re-encoding with a different embedding model). The goal is to minimize costly evaluations while still exploring the configuration space efficiently.

Background

The plan explorer \plan iteratively evaluates generation quality and predicts performance (via \perf and \ir) to approach the Pareto frontier. In realistic deployments, the joint algorithm–system configuration space is large, and simple grid search is often impractical.
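To make the target of the search concrete, the following minimal sketch extracts the Pareto frontier from a set of already evaluated configurations. It assumes each configuration is summarized as a (quality, latency) pair, where higher quality and lower latency are better; the function name pareto_frontier and this two-metric representation are illustrative assumptions, not part of the cited paper.

```python
from typing import List, Tuple

# Assumed representation: (quality, latency).
# Higher quality is better; lower latency is better.
Point = Tuple[float, float]


def pareto_frontier(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, ordered by increasing latency."""
    frontier: List[Point] = []
    best_quality = float("-inf")
    # Visit points from fastest to slowest; on latency ties, visit the
    # higher-quality point first so the lower-quality one is skipped.
    for quality, latency in sorted(points, key=lambda p: (p[1], -p[0])):
        # A point is kept only if it beats the quality of every faster point.
        if quality > best_quality:
            frontier.append((quality, latency))
            best_quality = quality
    return frontier


if __name__ == "__main__":
    evaluated = [(0.70, 10.0), (0.80, 20.0), (0.75, 25.0), (0.90, 40.0), (0.60, 15.0)]
    print(pareto_frontier(evaluated))
```

Any configuration that is both slower and no better in quality than another evaluated configuration is dropped; the remaining points are the current estimate of the frontier that further exploration tries to improve.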

Different configuration changes incur widely varying costs: swapping models or adjusting Top-K is relatively cheap, while rebuilding databases (changing index type, chunking strategy, or re-embedding) is expensive. Efficiently choosing the next configuration is therefore critical to reduce exploration cost while finding near-optimal solutions.
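One way to picture the cost asymmetry is a greedy heuristic that ranks candidates by expected gain per unit of switching cost. The sketch below is only an illustration under stated assumptions, not the method of the cited paper (which leaves this question open): the configuration keys, the cost constants, and the expected_gain callback are all hypothetical.

```python
from typing import Callable, Dict, List, Optional

Config = Dict[str, object]

# Assumed split of knobs, following the cheap/expensive distinction in the text.
REBUILD_KEYS = ("index_type", "chunking", "embedding_model")  # require a database rebuild
CHEAP_KEYS = ("llm", "top_k")                                 # no rebuild needed

# Illustrative relative costs in arbitrary units, not measured values.
REBUILD_COST = 100.0
CHEAP_COST = 1.0


def switch_cost(current: Config, candidate: Config) -> float:
    """Cost of moving from `current` to `candidate` under the assumed cost model."""
    if any(current[k] != candidate[k] for k in REBUILD_KEYS):
        return REBUILD_COST
    if any(current[k] != candidate[k] for k in CHEAP_KEYS):
        return CHEAP_COST
    return 0.0  # identical configuration


def choose_next(
    current: Config,
    candidates: List[Config],
    expected_gain: Callable[[Config], float],  # hypothetical estimate of frontier improvement
) -> Optional[Config]:
    """Pick the candidate with the highest expected gain per unit of switching cost."""
    best: Optional[Config] = None
    best_score = float("-inf")
    for cand in candidates:
        cost = switch_cost(current, cand)
        if cost == 0.0:
            continue  # skip the configuration we are already at
        score = expected_gain(cand) / cost
        if score > best_score:
            best, best_score = cand, score
    return best
```

Under this heuristic, a database rebuild is chosen only when its estimated gain outweighs the cheap alternatives by roughly the cost ratio, so inexpensive model and Top-K changes tend to be exhausted before an index is rebuilt. How to obtain a trustworthy gain estimate is exactly the part this sketch leaves unspecified.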

References

An open question here is how to choose the next configuration to evaluate (line 8 in Algorithm~\ref{alg:pareto-search}).

— RAG-Stack: Co-Optimizing RAG Quality and Performance From the Vector Database Perspective (2510.20296 - Jiang, 23 Oct 2025) in Research Gap 3, Section 4.3 (\plan for Configuration Space Navigation)
