ClashEval: Quantifying the tug-of-war between an LLM's internal prior and external evidence (2404.10198v2)

Published 16 Apr 2024 in cs.CL and cs.AI

Abstract: Retrieval augmented generation (RAG) is frequently used to mitigate hallucinations and provide up-to-date knowledge for LLMs. However, given that document retrieval is an imprecise task and sometimes results in erroneous or even harmful content being presented in context, this raises the question of how LLMs handle retrieved information: If the provided content is incorrect, does the model know to ignore it, or does it recapitulate the error? Conversely, when the model's initial response is incorrect, does it always know to use the retrieved information to correct itself, or does it insist on its wrong prior response? To answer this, we curate a dataset of over 1200 questions across six domains (e.g., drug dosages, Olympic records, locations) along with content relevant to answering each question. We further apply precise perturbations to the answers in the content that range from subtle to blatant errors. We benchmark six top-performing LLMs, including GPT-4o, on this dataset and find that LLMs are susceptible to adopting incorrect retrieved content, overriding their own correct prior knowledge over 60% of the time. However, the more unrealistic the retrieved content is (i.e. more deviated from truth), the less likely the model is to adopt it. Also, the less confident a model is in its initial response (via measuring token probabilities), the more likely it is to adopt the information in the retrieved content. We exploit this finding and demonstrate simple methods for improving model accuracy where there is conflicting retrieved content. Our results highlight a difficult task and benchmark for LLMs -- namely, their ability to correctly discern when it is wrong in light of correct retrieved content and to reject cases when the provided content is incorrect.

An Analytical Gaze into the Dynamics of Retrieval Augmented Generation Frameworks and LLMs

Introduction

Retrieval Augmented Generation (RAG) enhances the performance of LLMs by augmenting their outputs with information retrieved from external documents, compensating for LLMs' limited access to up-to-date information and helping to reduce hallucinated outputs. Notably, this research systematically examines how LLMs reconcile inconsistencies between their internally stored knowledge and the external information supplied by RAG when the two sources offer conflicting details.

Analyzing Model Behavior in RAG Contexts

The paper meticulously explores the interplay between an LLM's pre-existing knowledge and external information retrieved through RAG systems. Benchmarking six top-performing LLMs, including GPT-4o, the research spans over 1200 questions across six domain-specific datasets. It finds that LLMs are more likely to adhere to retrieved information when their confidence in their own prior answer (measured via token probabilities) is low. Conversely, with stronger priors, LLMs tend to resist external misinformation, highlighting a nuanced tension in integrating RAG with LLMs.
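
To make the evaluation protocol concrete, the sketch below outlines the kind of measurement the paper performs: each question is posed once without context to elicit the model's prior answer, and once with a perturbed document in context, and the response is then classified by whether the model kept its prior or adopted the perturbed value. This is an illustrative outline only, not the authors' released code; `query_model` and `answers_match` stand in for an actual LLM call and domain-specific answer comparison.

```python
# Illustrative outline of a ClashEval-style adherence measurement.
# `query_model` and `answers_match` are hypothetical placeholders,
# not the paper's released code.

from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Example:
    question: str
    perturbed_doc: str     # retrieved content whose answer has been modified
    perturbed_answer: str  # the (incorrect) value inserted into the document


def answers_match(a: str, b: str) -> bool:
    # Placeholder for the domain-specific normalization/comparison step.
    return a.strip().lower() == b.strip().lower()


def evaluate(examples: List[Example],
             query_model: Callable[[str], str]) -> Dict[str, float]:
    kept_prior = adopted_context = other = 0
    for ex in examples:
        # 1) Elicit the model's prior answer (no retrieved content).
        prior = query_model(f"Question: {ex.question}\nAnswer concisely.")
        # 2) Ask again with the perturbed document in context.
        with_ctx = query_model(
            f"Context: {ex.perturbed_doc}\n\nQuestion: {ex.question}\nAnswer concisely."
        )
        # 3) Classify the outcome.
        if answers_match(with_ctx, ex.perturbed_answer):
            adopted_context += 1   # model adopted the incorrect retrieved value
        elif answers_match(with_ctx, prior):
            kept_prior += 1        # model stuck with its prior answer
        else:
            other += 1
    n = max(len(examples), 1)
    return {
        "adopted_context": adopted_context / n,
        "kept_prior": kept_prior / n,
        "other": other / n,
    }
```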

Key Findings from the Study

  • Model Adherence to RAG: There is an inverse relationship between the model's internal confidence (its prior probability) and its propensity to rely on retrieved information: higher internal confidence correlates with decreased dependence on external data (a confidence-based arbitration sketch that exploits this relationship appears after this list).
  • Influence of Data Perturbation: As the retrieved content deviates further from the model's prior, LLMs increasingly favor their own knowledge over the modified external information. This relationship holds across variations in the prior's probability, suggesting some inherent robustness in LLMs against blatantly misleading RAG inputs.
  • Prompting Techniques and RAG Adherence: The paper further explores the impact of different prompting strategies on LLMs' RAG adherence. Results indicate that the phrasing of prompts significantly influences the model's reliance on RAG, with "strict" prompts enhancing adherence and "loose" prompts promoting skepticism towards retrieved content.
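
The abstract notes that the authors exploit the confidence relationship through "simple methods for improving model accuracy where there is conflicting retrieved content." The sketch below illustrates one plausible shape of such a method, not the paper's exact procedure: keep the answer generated without context only when its mean token log-probability exceeds that of the context-conditioned answer by a margin, and adopt the retrieved answer otherwise. `generate_with_logprobs` is a hypothetical wrapper around any API that returns per-token log-probabilities, and the default margin is purely illustrative.

```python
# Hedged sketch of confidence-based arbitration between an LLM's prior answer
# and a context-conditioned answer. `generate_with_logprobs` is a hypothetical
# wrapper; the margin default is illustrative, not a value from the paper.

from typing import Callable, List, Tuple

# Returns (answer_text, per-token log-probabilities of the generated answer).
GenerateFn = Callable[[str], Tuple[str, List[float]]]


def mean_logprob(logprobs: List[float]) -> float:
    return sum(logprobs) / max(len(logprobs), 1)


def arbitrate(question: str,
              retrieved_doc: str,
              generate_with_logprobs: GenerateFn,
              margin: float = 0.0) -> str:
    # Prior answer: no retrieved content in the prompt.
    prior_ans, prior_lp = generate_with_logprobs(
        f"Question: {question}\nAnswer concisely.")
    # Context-conditioned answer: retrieved document included.
    ctx_ans, ctx_lp = generate_with_logprobs(
        f"Context: {retrieved_doc}\n\nQuestion: {question}\nAnswer concisely.")
    # Keep the prior only when the model is markedly more confident in it;
    # otherwise defer to the retrieved content.
    if mean_logprob(prior_lp) > mean_logprob(ctx_lp) + margin:
        return prior_ans
    return ctx_ans
```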

Implications and Future Directions

The nuanced understanding of LLM behavior in the context of RAG systems provides valuable insights for both theoretical exploration and practical applications of generative AI. It underscores the importance of carefully considering the model's internal confidence and the accuracy of retrieved information in designing effective RAG systems. Additionally, the observed influence of prompting strategies on RAG integration opens avenues for refining interaction paradigms between LLMs and external knowledge sources.

Moving forward, this research advocates for a more sophisticated analysis of the symbiotic relationship between LLM priors and RAG inputs. Further investigation into the mechanisms that govern this relationship could lead to advancements in model architecture, prompting strategies, and information retrieval techniques, enhancing the reliability and accuracy of LLM outputs in real-world applications.

Conclusion

This investigation into the dynamics between LLMs' internal knowledge and external information provided by RAG systems unearthed critical insights into the operational mechanisms of RAG-integrated LLMs. By scrutinizing the conditions under which LLMs either favor or disregard external inputs, the research charts a path for future advancements in the field. The observed interplay between model confidence, data deviation, and the impact of prompting techniques not only broadens our understanding of LLM capabilities but also propels us towards creating more sophisticated and reliable generative AI systems.

Authors (3)
  1. Kevin Wu (20 papers)
  2. Eric Wu (17 papers)
  3. James Zou (232 papers)
Citations (25)