Active Retrieval Augmented Generation (2305.06983v2)

Published 11 May 2023 in cs.CL and cs.LG

Abstract: Despite the remarkable ability of large language models (LMs) to comprehend and generate language, they have a tendency to hallucinate and create factually inaccurate output. Augmenting LMs by retrieving information from external knowledge resources is one promising solution. Most existing retrieval augmented LMs employ a retrieve-and-generate setup that only retrieves information once based on the input. This is limiting, however, in more general scenarios involving generation of long texts, where continually gathering information throughout generation is essential. In this work, we provide a generalized view of active retrieval augmented generation, methods that actively decide when and what to retrieve across the course of the generation. We propose Forward-Looking Active REtrieval augmented generation (FLARE), a generic method which iteratively uses a prediction of the upcoming sentence to anticipate future content, which is then utilized as a query to retrieve relevant documents to regenerate the sentence if it contains low-confidence tokens. We test FLARE along with baselines comprehensively over 4 long-form knowledge-intensive generation tasks/datasets. FLARE achieves superior or competitive performance on all tasks, demonstrating the effectiveness of our method. Code and datasets are available at https://github.com/jzbjyb/FLARE.


Active Retrieval Augmented Generation for Enhanced Text Generation

Introduction

Integrating external knowledge into the generation process of large language models (LMs) is a promising way to address a known limitation of these models: their tendency to produce factually incorrect or hallucinated content. Active retrieval augmented generation tackles this issue by letting the LM incorporate relevant external information during generation rather than only before it. This matters most for long-form, knowledge-intensive content, where the information needed changes as the text progresses.

Retrieval Augmented Generation Framework

The paper presents a general framework for retrieval augmented generation in which the system actively decides when and what to retrieve throughout generation. It first introduces the notation and definitions for single-time and active retrieval augmented generation, which set the stage for the proposed methods.
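The distinction between the two settings can be written compactly. The formulation below is a sketch: the symbols ($x$ for the input, $y$ for the output, $y_{<t}$ for the text generated before step $t$, $\mathcal{D}_q$ for the documents retrieved with query $q$, and $\mathrm{qry}$ for a query-formulation function) are notation assumed here for illustration and may differ from the paper's own.

```latex
% Single-time retrieval: retrieve once using the input x, then generate.
y = \mathrm{LM}\big([\mathcal{D}_{x},\, x]\big)

% Active retrieval: at step t, form a query from the input and the text so far,
% retrieve with it, then generate the next segment.
q_t = \mathrm{qry}\big(x,\, y_{<t}\big), \qquad
y_t = \mathrm{LM}\big([\mathcal{D}_{q_t},\, x,\, y_{<t}]\big)
```

Under this view, single-time retrieval is the special case in which the query is the input itself and retrieval happens only before generation starts.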

Traditional vs. Active Retrieval Approaches

Previous work has largely focused on single-time retrieval or passive multi-time retrieval, which either retrieves once at the beginning of generation or retrieves at fixed intervals regardless of need. These strategies often fail to reflect what the model intends to generate next and may retrieve at inappropriate moments. By contrast, active retrieval augmented generation, exemplified by the Forward-Looking Active REtrieval augmented generation (FLARE) method, makes retrieval dynamic and dependent on the generation context.
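The three retrieval schedules contrasted above can be sketched as simple trigger predicates. This is an illustrative Python sketch, not the paper's code: the function names, the notion of a generation `step`, and the default `interval` and `threshold` values are all assumptions made for the example.

```python
from typing import Sequence


def single_time_trigger(step: int) -> bool:
    """Single-time retrieval: retrieve only before generation starts."""
    return step == 0


def fixed_interval_trigger(step: int, interval: int = 4) -> bool:
    """Passive multi-time retrieval: retrieve every `interval` steps,
    whether or not the model needs new information (interval is hypothetical)."""
    return step % interval == 0


def active_trigger(token_probs: Sequence[float], threshold: float = 0.5) -> bool:
    """Active retrieval: retrieve only when the tentative next segment
    contains at least one token whose probability is below `threshold`."""
    return any(p < threshold for p in token_probs)
```

The first two predicates depend only on position, while the third depends on the model's own confidence, which is what allows retrieval to be skipped when the model already appears to know the answer.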

FLARE Methodology

FLARE iteratively predicts the upcoming sentence and uses that prediction to anticipate future content. If the predicted sentence contains low-confidence tokens, it is used as a query to retrieve relevant documents, and the sentence is regenerated with those documents in context; otherwise the prediction is kept. Retrieval is therefore triggered only when the model appears uncertain, which ties both the timing and the content of each query to what the model is about to write.
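The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `generate_sentence` and `retrieve` are hypothetical callables supplied by the caller, a sentence is represented as a list of (token, probability) pairs, and the `threshold` and `max_sentences` defaults are arbitrary.

```python
from typing import Callable, List, Tuple

# Hypothetical representation: one sentence as (token, probability) pairs.
Sentence = List[Tuple[str, float]]


def flare_generate(
    question: str,
    generate_sentence: Callable[[str, str, List[str]], Sentence],
    retrieve: Callable[[str], List[str]],
    threshold: float = 0.5,
    max_sentences: int = 8,
) -> str:
    """Forward-looking active retrieval loop (illustrative sketch)."""
    answer = ""
    for _ in range(max_sentences):
        # 1. Draft the next sentence without any retrieved documents.
        draft = generate_sentence(question, answer, [])
        if not draft:  # the model has nothing more to say
            break
        # 2. Retrieve only if the draft contains a low-confidence token.
        if min(prob for _, prob in draft) < threshold:
            query = " ".join(token for token, _ in draft)
            docs = retrieve(query)
            # 3. Regenerate the sentence with the documents in context.
            draft = generate_sentence(question, answer, docs)
            if not draft:
                break
        sentence = " ".join(token for token, _ in draft)
        answer = (answer + " " + sentence).strip()
    return answer
```

In this sketch the whole tentative sentence is used as the query. Confident drafts are accepted as they are, so the retriever is called only for sentences the model is unsure about.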

Experimental Findings

The authors evaluate FLARE on four long-form, knowledge-intensive generation tasks: multihop question answering, commonsense reasoning, long-form question answering, and open-domain summarization. FLARE achieves superior or competitive performance on all of them compared with single-time and passive multi-time retrieval baselines, which indicates that the method applies across a range of long-form generation scenarios.

Implications and Future Directions

The implications of this research extend beyond the reported performance gains. By retrieving when the model is uncertain, the active retrieval approach narrows the gap between fluent generation and factual accuracy, which is the central obstacle for generative LMs in knowledge-intensive applications.

Future developments might focus on refining the retrieval mechanisms within this framework to optimize efficiency and accuracy further. Additionally, exploring the integration of FLARE with more advanced LLMs or diverse knowledge sources could unfold new possibilities in generating more coherent, contextually rich, and factually accurate texts.

Conclusion

Active retrieval augmented generation, particularly through the FLARE method, marks a significant progression in enhancing LLMs' ability to generate factually correct and contextually relevant text by dynamically incorporating external information. This research opens new avenues for improving the generative quality and applicability of LMs in various knowledge-driven text generation tasks.