RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation
Abstract: We explore how iteratively revising a chain of thoughts with the help of information retrieval significantly improves LLMs' reasoning and generation ability on long-horizon generation tasks, while greatly mitigating hallucination. In particular, the proposed method, retrieval-augmented thoughts (RAT), revises each thought step one by one with information retrieved for the task query and the current and past thought steps, after the initial zero-shot CoT is generated. Applying RAT to GPT-3.5, GPT-4, and CodeLLaMA-7b substantially improves their performance on various long-horizon generation tasks, with average relative rating-score gains of 13.63% on code generation, 16.96% on mathematical reasoning, 19.2% on creative writing, and 42.78% on embodied task planning. The demo page can be found at https://craftjarvis.github.io/RAT
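The revision loop the abstract describes can be sketched in a few lines. This is a minimal illustration only, not the paper's implementation: `llm` and `retrieve` are hypothetical stubs standing in for a real language-model call and a real retriever, and the prompt format is an assumption.

```python
def llm(prompt: str) -> str:
    """Stub LLM call (hypothetical); replace with a real model API."""
    return f"revised({prompt})"

def retrieve(query: str) -> str:
    """Stub retriever (hypothetical); replace with a real search backend."""
    return f"doc-for[{query}]"

def rat(task: str, initial_thoughts: list[str]) -> list[str]:
    """Revise a zero-shot chain of thoughts step by step with retrieval.

    Each revision is conditioned on the task query, the already-revised
    steps, and the current draft step, as the abstract describes.
    """
    revised: list[str] = []
    for step in initial_thoughts:
        # Build a retrieval query from the task query, the past
        # (already revised) steps, and the current draft step.
        query = " ".join([task, *revised, step])
        evidence = retrieve(query)
        # Ask the LLM to rewrite the current step given the evidence.
        prompt = f"task: {task}\nevidence: {evidence}\nstep: {step}"
        revised.append(llm(prompt))
    return revised

steps = rat("solve X", ["draft step 1", "draft step 2"])
```

The key design point is that revision is sequential: step *i* is revised only after steps 1..*i-1* have been revised, so each retrieval query reflects the corrected reasoning so far rather than the original zero-shot draft.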