
Embedding-Informed Adaptive Retrieval-Augmented Generation of Large Language Models (2404.03514v2)

Published 4 Apr 2024 in cs.CL and cs.AI

Abstract: Retrieval-augmented LLMs have proven remarkably competent on a wide range of NLP tasks. However, previous works have observed that retrieval is not always helpful, especially when the LLM already knows the answer to the query. Motivated by this, Adaptive Retrieval-Augmented Generation (ARAG) retrieves only when the knowledge the query asks for is absent from the LLM. Existing ARAG methods either require access to the pre-training corpus or rely on prompting with additional model inferences. To avoid these drawbacks, we propose to determine whether the model is knowledgeable about a query by inspecting the (contextualized) pre-trained token embeddings of LLMs. We hypothesize that such embeddings capture rich information about the model's intrinsic knowledge, enabling an efficient way of judging whether retrieval from an external corpus is necessary. Extensive experiments demonstrate our ARAG approach's superior performance across various benchmarks.
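The decision the abstract describes, retrieving only when the model's own embeddings suggest a knowledge gap, can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's implementation: the model name, the mean-pooling, the linear probe, and the threshold are all assumptions standing in for whatever embedding-based judge the authors actually train.

```python
# Hypothetical sketch of embedding-informed adaptive retrieval (not the paper's exact method).
# Idea: pool the LLM's contextualized token embeddings for the query and feed them to a
# lightweight scorer that predicts whether external retrieval is needed.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_NAME = "meta-llama/Llama-2-7b-hf"  # any causal LM; this name is illustrative

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()

# A tiny linear probe standing in for whatever classifier/threshold the paper trains
# on top of the embeddings; in practice it would be fit on labeled queries.
probe = torch.nn.Linear(model.config.hidden_size, 1)

@torch.no_grad()
def should_retrieve(query: str, threshold: float = 0.5) -> bool:
    """Return True if the probe judges the model's parametric knowledge insufficient."""
    inputs = tokenizer(query, return_tensors="pt")
    outputs = model(**inputs)
    # Mean-pool the last layer's contextualized token embeddings of the query.
    last_hidden = outputs.hidden_states[-1]   # shape: (1, seq_len, hidden_size)
    pooled = last_hidden.mean(dim=1)          # shape: (1, hidden_size)
    knowledge_gap_score = torch.sigmoid(probe(pooled)).item()
    return knowledge_gap_score > threshold

query = "Who won the 2023 Nobel Prize in Physics?"
if should_retrieve(query):
    pass  # call a retriever (e.g., BM25 or a dense retriever) and augment the prompt
else:
    pass  # answer directly from the LLM's parametric knowledge
```

Under these assumptions, the extra cost of the retrieval decision is a single forward pass over the query plus a small probe, rather than access to the pre-training corpus or additional prompted inferences.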

Authors (7)
  1. Chengkai Huang (13 papers)
  2. Rui Wang (996 papers)
  3. Kaige Xie (11 papers)
  4. Tong Yu (119 papers)
  5. Lina Yao (194 papers)
  6. Yu Xia (65 papers)
  7. Julian McAuley (238 papers)
Citations (3)