- The paper presents RARR, a framework that retrieves evidence for language model outputs and post-edits unsupported content to improve attribution.
- The system employs a two-stage process: a research stage that retrieves supporting evidence from the web, and a revision stage that uses agreement and edit models to correct unsupported claims.
- Evaluations on benchmarks such as Natural Questions and StrategyQA show that RARR improves attribution while keeping stylistic changes to the original text to a minimum.
Overview of RARR: Enhancing LLM Attribution
The paper "RARR: Researching and Revising What LLMs Say, Using LLMs" proposes RARR (Retrofit Attribution using Research and Revision), a system devised to address the challenges of attribution in LLMs (LMs). By determining evidence for LM outputs and post-editing unsourced or incorrect content, RARR seeks to improve the credibility of generated text.
Modern LMs such as GPT-3 and LaMDA exhibit strong performance across tasks like question answering and dialog generation. However, a critical shortcoming is their occasional generation of hallucinated or unsupported statements. This paper recognizes a gap in attribution mechanisms and contributes by introducing RARR, an automated system that operates transparently above existing LMs.
Major Contributions
- Editing for Attribution Task: The authors formalize the task of Editing for Attribution, defining metrics that balance how much of the output is supported by retrieved evidence (attribution) against how much of the original text is retained after revision (preservation). Rather than optimizing a single dimension, the evaluation considers both jointly, using benchmarks that cover factoid questions, reasoning chains, and dialog responses (see the preservation sketch after this list).
- RARR Framework: The system comprises two stages, research and revision. During research, RARR generates verification queries for the LM output and retrieves relevant web evidence. During revision, an agreement model identifies content that the evidence does not support and an edit model rewrites it (a pipeline sketch follows this list).
- Agreement and Edit Models: RARR relies on Google Search for retrieval and few-shot prompting of PaLM for query generation, agreement checking, and editing. Through these prompts, the agreement model reasons about whether the retrieved evidence supports each part of the passage, and the edit model revises only the spans it does not.
- Evaluation and Results: RARR delivered clear gains in attribution with only marginal changes to the style and structure of the original outputs. Its evaluation used datasets such as Natural Questions and StrategyQA, demonstrating robustness across different text types and generation objectives.
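As a concrete illustration of the preservation side of the evaluation, the sketch below scores how much of the original passage survives a revision using a normalized character-level Levenshtein distance. This is a minimal sketch assuming a Levenshtein-based preservation score of the kind the paper describes; the paper's full metric also incorporates a judgment of whether the original intent is preserved, and the function names here are illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def preservation_lev(original: str, revision: str) -> float:
    """Score in [0, 1]; 1.0 means the revision left the text untouched."""
    if not original:
        return 1.0
    return max(1.0 - levenshtein(original, revision) / len(original), 0.0)

print(preservation_lev("RARR edits model outputs.",
                       "RARR revises model outputs."))  # 0.84
```

Under a metric like this, an editor that simply regenerates the passage from the evidence scores near zero even if its attribution is perfect, which is why attribution and preservation must be reported together.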
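The control flow of the research and revision stages can be summarized as follows. This is a minimal sketch of the loop, not the authors' implementation: `llm_few_shot` stands in for a prompted model such as PaLM, `web_search` for a retrieval service such as Google Search, and the prompt strings are placeholders for the paper's few-shot prompts.

```python
def rarr_edit(passage: str, llm_few_shot, web_search, max_evidence: int = 5) -> dict:
    """Research-and-revise a passage; returns the revision plus the evidence used."""
    # 1. Research: generate verification questions about the passage.
    queries = llm_few_shot(
        f"Write questions that verify the factual claims in:\n{passage}"
    ).splitlines()

    # Retrieve candidate evidence snippets for each query.
    evidence = []
    for q in queries:
        evidence.extend(web_search(q)[:max_evidence])

    # 2. Revision: for each snippet, check agreement and edit if needed.
    revised = passage
    for snippet in evidence:
        verdict = llm_few_shot(
            f"Passage:\n{revised}\n\nEvidence:\n{snippet}\n\n"
            "Does the evidence contradict the passage? Answer yes or no."
        )
        if verdict.strip().lower().startswith("yes"):
            revised = llm_few_shot(
                "Minimally edit the passage so it agrees with the evidence.\n"
                f"Passage:\n{revised}\n\nEvidence:\n{snippet}\n\nEdited passage:"
            )

    # The retained evidence doubles as an attribution report for the revision.
    return {"revised_text": revised, "attributions": evidence}
```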
Implications and Future Scope
The application of RARR can reinforce the reliability of LMs in fields where trust and verification are essential, such as journalism, law, and scientific writing. By grounding outputs in external sources, RARR gives LMs a baseline for factual accuracy, which is crucial for deployment in real-world systems where unverified information can have adverse consequences.
On the theoretical front, RARR's methodology offers a path toward decoupling retrieval from text generation, allowing pre-trained models to be used more broadly without extensive in-domain retraining. This shift encourages further exploration of non-intrusive frameworks in which LMs work collaboratively with external knowledge sources.
Conclusion
RARR marks a significant advance by improving the reliability of generative model outputs without extensive modification or retraining. Its combination of retrieval and post-editing sets a precedent for revising model outputs against verified evidence. As LLMs continue to evolve, RARR serves as a foundation for future work on attribution, helping ensure that AI outputs can be traced to credible sources and thereby increasing trust in AI-driven applications. Future work could explore automated strategies for deciding when attribution is unnecessary and refine the models to handle subjective or context-dependent content.