Learning to Filter Context for Retrieval-Augmented Generation (2311.08377v1)

Published 14 Nov 2023 in cs.CL and cs.AI

Abstract: On-the-fly retrieval of relevant knowledge has proven an essential element of reliable systems for tasks such as open-domain question answering and fact verification. However, because retrieval systems are not perfect, generation models are required to generate outputs given partially or entirely irrelevant passages. This can cause over- or under-reliance on context, and result in problems in the generated output such as hallucinations. To alleviate these problems, we propose FILCO, a method that improves the quality of the context provided to the generator by (1) identifying useful context based on lexical and information-theoretic approaches, and (2) training context filtering models that can filter retrieved contexts at test time. We experiment on six knowledge-intensive tasks with FLAN-T5 and LLaMa2, and demonstrate that our method outperforms existing approaches on extractive question answering (QA), complex multi-hop and long-form QA, fact verification, and dialog generation tasks. FILCO effectively improves the quality of context, whether or not it supports the canonical output.


Enhanced Context Filtering in Retrieval-Augmented Generation Models

Introduction to FILCO

Retrieval-augmented generation improves response quality on knowledge-intensive tasks such as open-domain question answering and fact verification. Because retrieval systems are imperfect, however, irrelevant or only partially relevant passages often reach the generator. The generator may then over- or under-rely on that context and produce hallucinated or inaccurate output. "Learning to Filter Context for Retrieval-Augmented Generation" addresses this with FILCO, a method that improves context quality by filtering out irrelevant content. FILCO uses lexical and information-theoretic measures to identify useful context, then trains context filtering models that filter retrieved passages at test time.

The FILCO Methodology

FILCO filters context at a fine granularity, sentence by sentence, using three measures:

String Inclusion (STRINC): Checks whether a passage lexically contains the expected output. It suits extractive tasks, where the answer can often be found verbatim in the text.
Lexical Overlap: Measures unigram overlap between the question and the context. It suits tasks where higher topic similarity signals more useful context.
Conditional Cross-Mutual Information (CXMI): Measures how much more likely the generator is to produce the expected output with the given context than without it. It suits more complex generative tasks.
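
The three measures can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the tokenizer, the use of unigram F1 as the overlap score, and the `log_prob` callable (a hypothetical stand-in for a generator that returns the log-probability of an output given a prompt) are all choices made here for illustration. The CXMI-style score is written as a difference of log-probabilities, which is one common way to compare the two conditions.

```python
import re
from collections import Counter
from typing import Callable


def tokenize(text: str) -> list[str]:
    """Lowercase and split into alphanumeric unigrams (an assumed tokenizer)."""
    return re.findall(r"\w+", text.lower())


def string_inclusion(span: str, output: str) -> bool:
    """String inclusion: True if the span lexically contains the expected output."""
    return output.lower() in span.lower()


def unigram_f1(span: str, reference: str) -> float:
    """Lexical overlap, computed here as unigram F1 between span and reference."""
    span_tokens, ref_tokens = tokenize(span), tokenize(reference)
    if not span_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared unigram at most min(count) times.
    common = sum((Counter(span_tokens) & Counter(ref_tokens)).values())
    if common == 0:
        return 0.0
    precision = common / len(span_tokens)
    recall = common / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def cxmi_score(
    log_prob: Callable[[str, str], float],  # hypothetical: (output, prompt) -> log p
    span: str,
    query: str,
    output: str,
) -> float:
    """Gain in log-probability of the output when the span is added to the query."""
    with_context = log_prob(output, span + " " + query)
    without_context = log_prob(output, query)
    return with_context - without_context
```

A positive `cxmi_score` means the span makes the expected output more likely, so the span is worth keeping; a score near zero or below suggests it adds little or distracts the generator.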

Using these measures, FILCO removes extraneous content from the context passed to the generator. The shorter input also reduces computational cost.
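
To make the filtering and the resulting length reduction concrete, here is a small sketch, assuming a naive sentence splitter, a caller-supplied scoring function, and a threshold. All three are illustrative assumptions. In the paper the filtering at test time is done by a trained model, whereas this sketch simply applies a score to each sentence.

```python
import re
from typing import Callable


def split_sentences(passage: str) -> list[str]:
    """Naive splitter: break after ., ! or ? followed by whitespace (an assumption)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", passage) if s.strip()]


def filter_context(
    passage: str,
    score: Callable[[str], float],  # hypothetical per-sentence usefulness score
    threshold: float,
) -> str:
    """Keep only the sentences whose score reaches the threshold."""
    kept = [s for s in split_sentences(passage) if score(s) >= threshold]
    return " ".join(kept)


def length_reduction(original: str, filtered: str) -> float:
    """Fraction of whitespace-delimited tokens removed by filtering."""
    n_original = len(original.split())
    if n_original == 0:
        return 0.0
    return 1 - len(filtered.split()) / n_original


# Usage with a toy score that favours sentences mentioning "capital".
passage = (
    "Paris is the capital of France. "
    "It hosts many museums. "
    "The Seine runs through it."
)
toy_score = lambda sentence: 1.0 if "capital" in sentence.lower() else 0.0
filtered = filter_context(passage, toy_score, threshold=0.5)
print(filtered)
print(f"{length_reduction(passage, filtered):.0%} of tokens removed")
```

Token counts here are whitespace-based for simplicity; a real measurement would use the generator's own tokenizer.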

Experimental Validation

The paper evaluates FILCO on six knowledge-intensive language tasks using FLAN-T5 and LLaMa2 models. FILCO outperforms the baseline methods on extractive question answering, complex multi-hop and long-form question answering, fact verification, and dialog generation. It also reduces prompt length by 44% to 64% across tasks without sacrificing performance. The paper examines FILCO with both positive and negative retrieved passages, that is, passages that do or do not support the canonical output, and finds that it improves context quality in either case.

Implications and Future Directions

FILCO is a step toward more reliable use of external knowledge in generative models. Its improvements across a range of tasks address the immediate problem of imperfectly retrieved information and suggest directions for further research in retrieval-augmented generation. Future work could apply FILCO in other domains, assess its performance with larger models, or investigate alternative measures for context filtering. Its adaptability suggests that it could serve as a component in other retrieval-augmented systems.

Conclusion

FILCO is a promising method for making retrieval-augmented generation more reliable and efficient. By filtering the context provided to the generator, it mitigates the impact of irrelevant passages and yields more accurate, focused output. Refining such components will remain important as generative systems draw on ever larger stores of external knowledge.


Authors

Zhiruo Wang
Jun Araki
Zhengbao Jiang
Md Rizwan Parvez
Graham Neubig

