Unlocking Multi-View Insights in Knowledge-Dense Retrieval-Augmented Generation (2404.12879v1)
Abstract: While Retrieval-Augmented Generation (RAG) plays a crucial role in the application of LLMs, existing retrieval methods in knowledge-dense domains such as law and medicine still lack multi-perspective views, which are essential for improving interpretability and reliability. Previous research on multi-view retrieval has often focused solely on different semantic forms of a query, neglecting the expression of domain-specific knowledge perspectives. This paper introduces MVRAG, a novel multi-view RAG framework tailored for knowledge-dense domains, which uses intention-aware query rewriting from multiple domain viewpoints to enhance retrieval precision and thereby improve the effectiveness of the final inference. Experiments on legal and medical case retrieval demonstrate significant improvements in recall and precision with our framework. Our multi-perspective retrieval approach unlocks the potential of multi-view information to enhance RAG tasks, accelerating the further application of LLMs in knowledge-intensive fields.
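To make the idea of rewriting one query from several domain viewpoints concrete, the following is a minimal toy sketch, not the MVRAG implementation. Everything in it is an assumption made for illustration: the two view names and their vocabularies (`VIEW_TERMS`), the keyword filter in `rewrite_query` that stands in for an LLM-based intention-aware rewrite, the bag-of-words cosine similarity that stands in for a real retriever, and the sum used to fuse per-view scores. The abstract does not specify any of these details.

```python
import math
from collections import Counter

# Hypothetical view vocabularies (assumption: the abstract does not list the views).
VIEW_TERMS = {
    "symptoms": {"fever", "cough", "rash", "headache"},
    "treatment": {"antibiotics", "surgery", "rest"},
}


def tokenize(text):
    """Lowercase the text and split on whitespace, treating commas and periods as spaces."""
    return text.lower().replace(",", " ").replace(".", " ").split()


def rewrite_query(query, view):
    """Stand-in for an LLM rewrite: keep only the query terms that belong to one view."""
    return [t for t in tokenize(query) if t in VIEW_TERMS[view]]


def cosine(a, b):
    """Cosine similarity between two token lists using raw term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def multi_view_retrieve(query, corpus, top_k=2):
    """Score every document once per view, sum the per-view scores, and rank."""
    scores = {doc_id: 0.0 for doc_id in corpus}
    for view in VIEW_TERMS:
        view_query = rewrite_query(query, view)
        for doc_id, text in corpus.items():
            scores[doc_id] += cosine(view_query, tokenize(text))
    ranked = sorted(scores, key=lambda d: scores[d], reverse=True)
    return ranked[:top_k]


# Made-up toy corpus, for illustration only.
corpus = {
    "case_a": "Patient with fever and cough treated with antibiotics.",
    "case_b": "Patient with rash advised rest.",
    "case_c": "Fracture repaired by surgery.",
}

print(multi_view_retrieve("fever, cough, given antibiotics", corpus))
```

In this toy setup a case that matches the query under both views accumulates score from each, so it ranks above cases that match under one view or none. A real system would presumably replace the keyword filter with an LLM prompt per viewpoint and the cosine scorer with a dense or sparse retriever.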