Retrieve, Summarize, Plan: Advancing Multi-hop Question Answering with an Iterative Approach (2407.13101v1)

Published 18 Jul 2024 in cs.CL and cs.AI

Abstract: Multi-hop question answering is a challenging task with distinct industrial relevance, and Retrieval-Augmented Generation (RAG) methods based on LLMs have become a popular approach to tackle this task. Owing to the potential inability to retrieve all necessary information in a single iteration, a series of iterative RAG methods has been recently developed, showing significant performance improvements. However, existing methods still face two critical challenges: context overload resulting from multiple rounds of retrieval, and over-planning and repetitive planning due to the lack of a recorded retrieval trajectory. In this paper, we propose a novel iterative RAG method called ReSP, equipped with a dual-function summarizer. This summarizer compresses information from retrieved documents, targeting both the overarching question and the current sub-question concurrently. Experimental results on the multi-hop question-answering datasets HotpotQA and 2WikiMultihopQA demonstrate that our method significantly outperforms the state-of-the-art, and exhibits excellent robustness concerning context length.


Iterative Retrieval-Augmented Generation in Multi-hop Question Answering: The ReSP Approach

The paper "Retrieve, Summarize, Plan: Advancing Multi-hop Question Answering with an Iterative Approach" proposes an iterative Retrieval-Augmented Generation (RAG) framework for multi-hop question answering that adds dedicated summarization and planning steps. Multi-hop question answering is directly relevant to intelligent assistants and generative search, and the paper targets two problems that iterative RAG systems face on this task: context overload and redundant planning.

Multi-hop question answering requires combining information from multiple sources across several reasoning steps, so RAG methods that retrieve only once often fail to gather everything they need. The paper proposes ReSP, which extends iterative RAG with a dual-function summarizer designed to reduce the context overload and redundant planning observed in existing iterative methods.

Key Contributions and Methodology

The proposed ReSP (Retrieve, Summarize, Plan) paradigm enhances the multi-hop question answering framework by integrating a novel LLM-based summarizer, which carries out two pivotal roles. The summarizer compresses the retrieved information for both the overarching question and the sub-questions simultaneously, maintaining two distinct memory pathways: the global evidence memory and the local pathway memory. The global evidence memory aids in keeping track of comprehensive supporting information for the main question, thereby preventing over-planning when sufficient information has been accumulated. Meanwhile, the local pathway memory tracks the planning trajectory to prevent repetitive planning.
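The two memories described above can be pictured as a small data structure that the summarizer updates once per retrieval round. The following is a minimal sketch, not the authors' implementation: the `Memory` class, the `summarize` function, the `LLM` callable type, and both prompt strings are hypothetical names and wordings chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical type: any function mapping a prompt string to a completion string.
LLM = Callable[[str], str]


@dataclass
class Memory:
    # Evidence relevant to the overarching question, accumulated across rounds.
    global_evidence: List[str] = field(default_factory=list)
    # (sub_question, answer) pairs recording the planning trajectory so far.
    local_pathway: List[Tuple[str, str]] = field(default_factory=list)


def summarize(llm: LLM, question: str, sub_question: str,
              docs: List[str], memory: Memory) -> None:
    """Compress one round of retrieved documents into both memories."""
    context = "\n".join(docs)
    # Role 1: summarize the documents with respect to the overarching question.
    # (Prompt wording is an assumption, not taken from the paper.)
    evidence = llm(f"Summarize the passages as evidence for: {question}\n{context}")
    # Role 2: answer the current sub-question from the same documents.
    answer = llm(f"Answer using only the passages: {sub_question}\n{context}")
    memory.global_evidence.append(evidence)
    memory.local_pathway.append((sub_question, answer))
```

Because only the two summaries are stored, the raw documents from earlier rounds never need to be carried forward, which is how a design like this keeps the context short as iterations accumulate.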

ReSP is built from four modules: a Reasoner, a Retriever, a Summarizer, and a Generator. Because each module operates independently, the base model behind each one can be chosen and scaled separately. The design does not require model fine-tuning, which makes it easier to adapt while still addressing information overload and planning redundancy.
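One way to see how the four modules fit together is as a loop over interchangeable callables. The sketch below is an illustration under stated assumptions rather than the paper's code: the function name `resp_answer`, the module signatures, the choice to use the original question as the first retrieval query, and the `max_iters` cap are all hypothetical.

```python
from typing import Callable, List, Tuple

Pathway = List[Tuple[str, str]]  # (sub_question, answer) pairs


def resp_answer(
    question: str,
    reasoner: Callable[[str, List[str], Pathway], Tuple[bool, str]],
    retriever: Callable[[str], List[str]],
    summarizer: Callable[[str, str, List[str]], Tuple[str, str]],
    generator: Callable[[str, List[str], Pathway], str],
    max_iters: int = 3,  # assumed safety cap on retrieval rounds
) -> str:
    evidence: List[str] = []   # global evidence memory
    pathway: Pathway = []      # local pathway memory
    sub_question = question    # assumption: round one retrieves for the question itself
    for _ in range(max_iters):
        docs = retriever(sub_question)
        # The summarizer returns one summary for the main question and
        # one answer for the current sub-question.
        ev, ans = summarizer(question, sub_question, docs)
        evidence.append(ev)
        pathway.append((sub_question, ans))
        # The reasoner either declares the evidence sufficient or plans
        # the next sub-question, seeing both memories.
        done, next_sub = reasoner(question, evidence, pathway)
        if done:
            break
        sub_question = next_sub
    return generator(question, evidence, pathway)
```

Passing the modules in as plain callables mirrors the point made above: any one of them can be backed by a different model size without touching the others.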

Experimental Results

ReSP was evaluated on the HotpotQA and 2WikiMultihopQA benchmarks. It improved F1 over state-of-the-art methods by 4.1 points on HotpotQA and 5.9 points on 2WikiMultihopQA. The results also indicate that performance remains robust as context length grows over successive retrieval iterations. Experiments with different base model sizes per module showed that a larger model for the Generator improves performance, whereas the Summarizer and Reasoner may not benefit from a larger model, which suggests where compute can be saved in deployment.
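For readers unfamiliar with the metric, the sketch below shows a token-overlap F1 of the kind commonly used for extractive question-answering benchmarks. It assumes a simplified normalization (lowercasing, removing punctuation and English articles) and is not the official evaluation script of either dataset; the function names `normalize` and `token_f1` are illustrative.

```python
import re
import string
from collections import Counter
from typing import List


def normalize(text: str) -> List[str]:
    """Lowercase, drop punctuation and articles, then split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def token_f1(prediction: str, gold: str) -> float:
    pred, ref = normalize(prediction), normalize(gold)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

For example, a prediction of "Paris, France" against a gold answer of "Paris" has precision 1/2 and recall 1, giving an F1 of 2/3. Benchmark scores are the average of this value over all questions, reported as a percentage.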

Implications and Future Directions

ReSP improves the robustness and accuracy of multi-hop question answering by controlling how many retrieval rounds are run and by summarizing the output of each intermediate step. The work also illustrates the value of modularity: separating reasoning, retrieval, summarization, and generation gives finer control over a complex pipeline while avoiding the main weakness of single-round retrieval, which may fail to gather all necessary information in one pass.

Practically, ReSP offers a scalable and adaptable design that could be integrated into broader knowledge retrieval applications, improving both question-answering accuracy and computational efficiency. Theoretically, it opens avenues for optimizing individual modules within RAG systems and for applying query-focused summarization in other AI contexts. Future research could investigate better retrieval strategies, potentially using real-time adaptation mechanisms to make multi-hop question-answering frameworks more responsive.