Identifying optimal configurations for RAG pipelines under practical constraints
Determine the optimal configuration of a retrieval-augmented generation (RAG) pipeline (the choice of retriever, embedding model, language model, chunk size, and context limit) when labeled data is scarce and the query distribution is diverse and unpredictable, given the practical limitations of current evaluation and tuning tools such as RAGGED, RAGAS, and RAGProbe.
References
However, these tools have practical limitations: optimal configurations are often unknown, labeled data is scarce, and the range of possible queries is vast and unpredictable.
Source: RAG Without the Lag: Interactive Debugging for Retrieval-Augmented Generation Pipelines (2504.13587, Lauro et al., 18 Apr 2025), Section 2.2 (Related Work: RAG Pipeline Development), RAG Development Challenges