ERAGent: Enhancing Retrieval-Augmented Language Models with Improved Accuracy, Efficiency, and Personalization (2405.06683v1)
Abstract: Retrieval-augmented generation (RAG) significantly improves language understanding systems built on LLMs. The basic retrieve-then-read pipeline has evolved into a more elaborate process as various components have been integrated, sometimes even forming loop structures. Despite these advances in response accuracy, challenges persist: poor retrieval quality for complex questions that require multifaceted semantic information, inefficient knowledge re-retrieval during long-term serving, and a lack of personalized responses. To overcome these limitations, we introduce ERAGent, a framework that advances the state of the art in RAG. Our contribution is a set of synergistically operated modules: an Enhanced Question Rewriter and a Knowledge Filter for better retrieval quality, and a Retrieval Trigger that curtails extraneous external knowledge retrieval without sacrificing response quality. ERAGent also personalizes responses by incorporating a learned user profile. Its efficiency and personalization rest on the Experiential Learner module, which enables the AI assistant to expand its knowledge and model the user profile incrementally. Rigorous evaluations across six datasets and three question-answering tasks demonstrate ERAGent's superior accuracy, efficiency, and personalization, underscoring its potential to advance the RAG field and its applicability in practical systems.
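The abstract describes the pipeline only at the module level. The sketch below is a minimal, self-contained reading of that control flow; every name in it (`UserProfile`, `rewrite_question`, `retrieve`, `filter_knowledge`, `answer`) is a hypothetical stand-in for the corresponding ERAGent module, not the authors' implementation.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    """User preferences modeled incrementally by the Experiential Learner."""
    preferences: list = field(default_factory=list)


def rewrite_question(question: str) -> list:
    """Enhanced Question Rewriter: split a complex question into sub-queries
    covering its multifaceted semantics (stubbed as the identity here)."""
    return [question]


def retrieve(query: str) -> list:
    """Stub standing in for an external search / dense retriever."""
    return [f"passage retrieved for: {query}"]


def filter_knowledge(query: str, passages: list) -> list:
    """Knowledge Filter: keep only passages relevant to the query
    (stubbed; a real module would score relevance with an LLM)."""
    return passages[:3]


def answer(question: str, profile: UserProfile, cache: dict) -> str:
    evidence = []
    for q in rewrite_question(question):
        # Retrieval Trigger: consult external sources only when the
        # Experiential Learner's cache does not already hold the knowledge.
        if q not in cache:
            cache[q] = filter_knowledge(q, retrieve(q))
        evidence.extend(cache[q])
    # Generation conditioned on the filtered evidence and the user profile.
    return (f"[answer to {question!r} grounded in {len(evidence)} passage(s), "
            f"personalized with {len(profile.preferences)} preference(s)]")


if __name__ == "__main__":
    cache = {}
    profile = UserProfile(preferences=["prefers concise answers"])
    print(answer("Who founded DeepMind, and when?", profile, cache))
    # Second call reuses cached knowledge: the Retrieval Trigger fires no retrieval.
    print(answer("Who founded DeepMind, and when?", profile, cache))
```

The cache check is where the abstract's efficiency claim lives: once the Experiential Learner has stored filtered knowledge for a sub-query, later questions that rewrite to the same sub-query skip external retrieval entirely.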
Authors: Yunxiao Shi, Xing Zi, Zijing Shi, Haimin Zhang, Qiang Wu, Min Xu