The paper explores an approach to adaptive retrieval for LLMs that relies on external information rather than on the models' internal states. It motivates the work with two problems: the hallucinations LLMs exhibit in tasks such as question answering (QA), and the computational cost of Retrieval-Augmented Generation (RAG). Whereas existing adaptive methods estimate uncertainty from LLM-derived signals, which adds its own overhead, this research presents LLM-independent approaches that exploit external sources such as entity popularity and question type; the underlying decision loop is sketched below.
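The core idea is to decide, per question, whether retrieval is worth its cost. The following is a minimal sketch of that decision loop, not the paper's implementation: `needs_retrieval`, `retrieve`, and `generate` are hypothetical stand-ins for the retrieval predictor, the retriever, and the LLM call.

```python
# Minimal adaptive-retrieval loop: retrieve only when a cheap,
# LLM-independent predictor says the question needs external context.
# All three functions below are illustrative stubs, not the paper's code.

def needs_retrieval(question: str) -> bool:
    # Placeholder heuristic; a real predictor would use the external
    # features discussed in the next section.
    return question.lower().startswith(("who", "when", "which"))

def retrieve(question: str) -> str:
    return f"<top-k passages for: {question}>"  # stand-in for a retriever

def generate(question: str, context: str = "") -> str:
    return f"<LLM answer to: {question} | context: {context}>"  # stand-in for an LLM call

def answer(question: str) -> str:
    """Invoke the retriever only when the predictor deems it necessary."""
    if needs_retrieval(question):
        return generate(question, retrieve(question))
    return generate(question)  # otherwise rely on the LLM's parametric knowledge

print(answer("Who wrote The Master and Margarita?"))
```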
Methodology and Evaluation
The research introduces a framework built on simple external features, 27 in total across seven groups, that enables adaptive retrieval without invoking the LLM. The groups cover graph features, entity popularity, frequency statistics, knowledgability scores, question type, question complexity, and context relevance. These features are fed into lightweight classifiers that predict whether retrieval is necessary for a given question, evaluated on QA datasets including Natural Questions, SQuAD, TriviaQA, 2WikiMultiHopQA, HotpotQA, and MuSiQue; a sketch of this feature-to-classifier pipeline follows.
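As a concrete illustration, here is a hedged sketch of the pipeline. The three features, the toy labels, and the choice of logistic regression are assumptions made for this example; they stand in for the paper's 27 features across the seven groups listed above.

```python
# Hedged sketch of an external-feature classifier for the "retrieve or not" decision.
# Features, data, and model choice are illustrative assumptions, not the paper's setup.
import numpy as np
from sklearn.linear_model import LogisticRegression

def external_features(question: str, entity_popularity: float) -> np.ndarray:
    words = question.split()
    return np.array([
        len(words),                                    # question length as a complexity proxy
        question.lower().startswith(("who", "when")),  # coarse question-type indicator
        entity_popularity,                             # e.g., a normalized page-view count
    ], dtype=float)

# Toy training data: 1 = retrieval needed, 0 = the LLM alone answered correctly.
train = [
    ("Who directed the short film that preceded the 1998 winner?", 0.10, 1),
    ("Who wrote Hamlet?",                                          0.95, 0),
    ("When was the lesser-known sequel first screened abroad?",    0.05, 1),
    ("When did World War II end?",                                 0.90, 0),
]
X = np.stack([external_features(q, pop) for q, pop, _ in train])
y = np.array([label for _, _, label in train])

clf = LogisticRegression().fit(X, y)
new_q = "Who founded the small regional observatory in the valley?"
print(clf.predict([external_features(new_q, 0.05)]))  # expected: [1] (low-popularity entity, longer question)
```

Because such classifiers never touch the LLM, the retrieve-or-not decision costs essentially nothing compared with the extra LLM calls required by uncertainty-based adaptive retrieval.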
The results show that the proposed LLM-independent methods perform on par with more complex, LLM-based adaptive retrieval systems. Importantly, the new approach brings substantial efficiency gains, reducing PFLOPs and the number of LLM calls without compromising question-answering quality. The findings also reveal dataset-specific advantages, most notably on complex multi-hop questions, where the external features prove more robust than uncertainty estimation.
Numerical Results
Table \ref{tab:main_results_new} reports performance across the datasets. The external methods match, and occasionally exceed, the uncertainty-based approaches in In-Accuracy while requiring fewer retrieval calls and LLM calls. On the more challenging multi-hop datasets, MuSiQue and HotpotQA, the external features perform favorably, in several cases surpassing their uncertainty-based counterparts.
Implications and Future Directions
The successful use of external information for adaptive retrieval has notable implications for LLM efficiency and scalability. The work contributes an alternative paradigm that is particularly attractive in practical settings where serving large-scale LLMs strains computational budgets, and it opens the door to hybrid models that combine internal and external signals to further improve retrieval accuracy and efficiency; a speculative sketch of such a combination follows.
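One plausible form of such a hybrid, sketched here purely as an assumption rather than anything proposed in the paper, is to append an LLM-derived uncertainty signal to the external feature vector before classification. `sequence_logprob` is a hypothetical hook into the generator's token log-probabilities.

```python
import numpy as np

def hybrid_features(question: str, entity_popularity: float,
                    sequence_logprob: float) -> np.ndarray:
    """Concatenate cheap external signals with one LLM-internal signal.

    Speculative sketch: the external features mirror the earlier example,
    and `sequence_logprob` is assumed to be supplied by the generator.
    """
    external = [len(question.split()), entity_popularity]  # LLM-independent features
    uncertainty = -sequence_logprob                         # higher value = less confident model
    return np.array(external + [uncertainty], dtype=float)
```

The trade-off is that computing the uncertainty signal reintroduces an LLM call, so a hybrid of this kind would only pay off if the accuracy gain outweighs that extra cost.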
Conceptually, the work shifts the emphasis from dependence on internal model states to more versatile use of external features, broadening the design space for adaptive systems in NLP. Future studies may extend the evaluation to additional datasets and model architectures and refine the external feature categories to maximize their complementarity and applicability across diverse tasks.
In conclusion, by demonstrating that external features support adaptive retrieval without reliance on LLM internals, this research sets a precedent for more efficient, resource-conscious approaches to mitigating hallucinations in LLMs.