
Retrieval-Generation Synergy Augmented Large Language Models (2310.05149v1)

Published 8 Oct 2023 in cs.CL

Abstract: LLMs augmented with task-relevant documents have demonstrated impressive performance on knowledge-intensive tasks. Existing methods for obtaining effective documents fall mainly into two categories: retrieving them from an external knowledge base, or using LLMs themselves to generate them. We propose an iterative retrieval-generation collaborative framework. It not only leverages both parametric and non-parametric knowledge, but also helps to find the correct reasoning path through retrieval-generation interactions, which is crucial for tasks that require multi-step reasoning. We conduct experiments on four question answering datasets, covering both single-hop and multi-hop QA tasks. Empirical results show that our method significantly improves the reasoning ability of LLMs and outperforms previous baselines.

Introduction

The recently proposed ITRG (Iterative Retrieval-Generation) framework marks a significant advance in augmenting LLMs for knowledge-intensive tasks. Unlike traditional methods, which either retrieve documents from an external knowledge base or generate documents using LLMs, ITRG integrates both retrieval and generation in a collaborative loop. This iterative interaction not only improves task-specific knowledge utilization but also helps discover the accurate reasoning paths essential for multi-hop question answering.

Iterative Retrieval-Generation Synergy

The framework is engineered around alternating between two core components – Generation Augmented Retrieval (GAR) and Retrieval Augmented Generation (RAG). GAR entails expanding queries by merging pseudo-documents, produced by the model itself, with the original questions. This expansion significantly refines the accuracy of document retrieval. Conversely, RAG involves generating new documents based on the comprehension of both original questions and documents procured during retrieval. The synergy between these strategies refines the generated answers through successive iterations, allowing for complex, multi-staged reasoning to be captured within the LLM's responses.
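The GAR/RAG alternation described above can be sketched as a short loop. In the sketch below, the retriever and the "generator" are toy stand-ins (word-overlap ranking over a tiny corpus, and evidence concatenation in place of an actual LLM call) so the example runs end to end; only the loop structure, including the "refresh" strategy where the pseudo-document is rewritten from scratch each round, mirrors ITRG. All names here are illustrative, not the authors' code.

```python
# Toy sketch of the ITRG loop: alternate generation-augmented retrieval
# (GAR) with retrieval-augmented generation (RAG).

CORPUS = [
    "Paris is the capital of France.",
    "The Eiffel Tower is located in Paris.",
    "Berlin is the capital of Germany.",
]

def _tokens(text: str) -> set[str]:
    return set(text.lower().replace(".", " ").replace("?", " ").split())

def dense_retrieve(query: str, top_k: int = 2) -> list[str]:
    # Stand-in retriever: rank passages by word overlap with the query.
    q = _tokens(query)
    return sorted(CORPUS, key=lambda d: -len(q & _tokens(d)))[:top_k]

def gar_retrieve(question: str, pseudo_doc: str, top_k: int = 2) -> list[str]:
    # GAR: expand the query with the model-written pseudo-document
    # before retrieving, sharpening what the retriever looks for.
    return dense_retrieve(f"{question} {pseudo_doc}", top_k=top_k)

def rag_generate(question: str, docs: list[str]) -> str:
    # RAG: in the paper an LLM writes a new document conditioned on the
    # question and retrieved evidence; here we simply join the evidence.
    return " ".join(docs)

def itrg(question: str, iterations: int = 2) -> str:
    pseudo_doc = ""  # first round degenerates to plain retrieval
    for _ in range(iterations):
        docs = gar_retrieve(question, pseudo_doc)
        # "refresh" strategy: discard the old pseudo-document and
        # rewrite it from the newly retrieved evidence each iteration.
        pseudo_doc = rag_generate(question, docs)
    return pseudo_doc

answer = itrg("What is the capital of France?")
print(answer)
```

With a real dense retriever and LLM substituted for the stand-ins, each iteration lets retrieval correct the model's hallucinations while generation fills gaps in the retrieved evidence.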

Experimental Results

On a suite of four prominent datasets spanning single-hop and multi-hop question answering, namely Natural Questions, TriviaQA, 2WikiMultiHopQA, and HotpotQA, ITRG outperforms existing baselines. Performance further improves over successive iterations, confirming the value of the iterative design. Notably, the framework outperformed competing models in zero-shot, one-shot, and five-shot settings, with ITRG's 'refresh' strategy achieving particularly strong scores in the zero-shot setting, highlighting its ability to synthesize accurate knowledge without additional fine-tuning.

Conclusion

The ITRG framework emerges as a robust method that significantly elevates the reasoning capabilities of LLMs on knowledge-intensive tasks. By harnessing both parametric and non-parametric knowledge, ITRG systematically outperforms previous retrieval-augmented methods. The iterative retrieval-generation loop addresses the challenge of accessing and refining the information needed for complex question answering. The work illustrates the promise of iterative, integrated approaches for enhancing LLM performance on tasks demanding deep, multi-faceted knowledge.

Authors: Zhangyin Feng, Xiaocheng Feng, Dezhi Zhao, Maojin Yang, Bing Qin