
Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models (2406.04271v2)

Published 6 Jun 2024 in cs.CL

Abstract: We introduce Buffer of Thoughts (BoT), a novel and versatile thought-augmented reasoning approach for enhancing accuracy, efficiency and robustness of LLMs. Specifically, we propose meta-buffer to store a series of informative high-level thoughts, namely thought-template, distilled from the problem-solving processes across various tasks. Then for each problem, we retrieve a relevant thought-template and adaptively instantiate it with specific reasoning structures to conduct efficient reasoning. To guarantee the scalability and stability, we further propose buffer-manager to dynamically update the meta-buffer, thus enhancing the capacity of meta-buffer as more tasks are solved. We conduct extensive experiments on 10 challenging reasoning-intensive tasks, and achieve significant performance improvements over previous SOTA methods: 11% on Game of 24, 20% on Geometric Shapes and 51% on Checkmate-in-One. Further analysis demonstrate the superior generalization ability and model robustness of our BoT, while requiring only 12% of the cost of multi-query prompting methods (e.g., tree/graph of thoughts) on average. Notably, we find that our Llama3-8B+BoT has the potential to surpass Llama3-70B model. Our project is available at: https://github.com/YangLing0818/buffer-of-thought-LLM

Buffer of Thoughts: An Overview of Thought-Augmented Reasoning with LLMs

The paper "Buffer of Thoughts: Thought-Augmented Reasoning with LLMs" introduces an advanced method for augmenting the reasoning capabilities of LLMs by utilizing a concept called "Buffer of Thoughts" (BoT). This model presents a novel framework aimed at enhancing the accuracy, efficiency, and robustness of LLMs through a dynamic retrieval and adaptation mechanism of high-level thought-templates.

Key Contributions

The proposed BoT framework comprises four key components: the meta-buffer, buffer-manager, problem distiller, and instantiated reasoning. Together, these components provide a dynamic, adaptable reasoning process that leverages past problem-solving experience; a minimal end-to-end sketch follows the list below.

  1. Meta-Buffer: A lightweight library storing a collection of high-level thought-templates distilled from previous tasks. These templates encapsulate generalized solutions and reasoning structures that can be adapted to address new problems.
  2. Buffer-Manager: A dynamic module responsible for updating the meta-buffer by distilling new high-level thought-templates from ongoing problem-solving activities. This ensures that the meta-buffer grows in its capacity and effectiveness over time.
  3. Problem Distiller: A preprocessing module designed to extract and formalize critical task-specific information and constraints from the input query. This distilled information aids in identifying relevant thought-templates from the meta-buffer.
  4. Instantiated Reasoning: The process of adapting and applying a selected thought-template to the task at hand, thus performing the reasoning process more efficiently and accurately.
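
To make the interplay of these components concrete, the sketch below traces one query through the pipeline. It is a minimal illustration under assumed names: `MetaBuffer`, `ThoughtTemplate`, `solve`, and the `llm` callable are all hypothetical, not the API of the buffer-of-thought-LLM repository, and crude word overlap stands in for the embedding-based retrieval the paper describes.

```python
# Minimal sketch of the BoT pipeline; all names are illustrative assumptions.
from dataclasses import dataclass, field


def overlap(a: str, b: str) -> int:
    """Crude word-overlap score standing in for embedding similarity."""
    return len(set(a.lower().split()) & set(b.lower().split()))


@dataclass
class ThoughtTemplate:
    description: str  # high-level summary used for retrieval
    structure: str    # generalized reasoning steps to instantiate


@dataclass
class MetaBuffer:
    templates: list = field(default_factory=list)

    def retrieve(self, distilled_problem: str):
        """Return the best-matching template, or None if nothing matches."""
        scored = [(overlap(distilled_problem, t.description), t)
                  for t in self.templates]
        score, best = max(scored, key=lambda s: s[0], default=(0, None))
        return best if score > 0 else None


def solve(query: str, buffer: MetaBuffer, llm) -> str:
    """One pass through the four components; `llm` is any str -> str callable."""
    # 1. Problem distiller: extract task-specific information and constraints.
    distilled = llm(f"Extract the key variables and constraints:\n{query}")
    # 2. Meta-buffer lookup: find a relevant thought-template.
    template = buffer.retrieve(distilled)
    if template is not None:
        # 3. Instantiated reasoning: adapt the template to this instance.
        return llm(f"Follow this reasoning structure:\n{template.structure}\n"
                   f"Problem:\n{distilled}")
    # 4. Buffer-manager: solve from scratch, then distill and store a new
    # template so later, similar queries can reuse it. (A fuller system
    # would distill the description and structure separately.)
    answer = llm(f"Solve this step by step:\n{distilled}")
    summary = llm(f"Summarize the general solution method:\n{answer}")
    buffer.templates.append(ThoughtTemplate(description=summary, structure=summary))
    return answer
```

In the actual system, distillation, instantiation, and template updates are each handled by dedicated prompts; the control flow, however, follows this retrieve-or-distill pattern.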

Empirical Validation

Extensive experiments were conducted on ten reasoning-intensive tasks to evaluate BoT's performance. The results showed significant improvements across several benchmarks:

  • Game of 24: 11% improvement over the previous state-of-the-art (SOTA) methods.
  • Geometric Shapes: 20% improvement.
  • Checkmate-in-One: 51% improvement.

BoT demonstrated superior generalization and model robustness while incurring, on average, only 12% of the computational cost of multi-query prompting methods such as Tree-of-Thoughts (ToT) and Graph-of-Thoughts (GoT), showcasing the cost-efficiency of the framework.

Analysis and Implications

Accuracy Improvement

BoT significantly boosts reasoning accuracy by reusing thought-templates, eliminating the need to construct a new reasoning structure from scratch for each problem. The modular thought-templates let LLMs rely on established high-level strategies to solve similar tasks effectively.
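
As a concrete illustration of what an instantiated template can look like for Game of 24, the brute-force search below acts as a reusable "programming-style" template: the high-level strategy (enumerate pairings and operators) stays fixed, and only the four input numbers change per query. This is a generic sketch, not the paper's actual template.

```python
# Illustrative template instantiation for Game of 24: a generic
# brute-force solver, not the paper's actual thought-template.
def solve_24(nums, target=24, eps=1e-6):
    """Search over (value, expression) pairs; return an expression that
    evaluates to `target`, or None if no combination works."""
    if len(nums) == 1:
        value, expr = nums[0]
        return expr if abs(value - target) < eps else None
    for i in range(len(nums)):
        for j in range(len(nums)):
            if i == j:
                continue
            (a, ea), (b, eb) = nums[i], nums[j]
            rest = [nums[k] for k in range(len(nums)) if k not in (i, j)]
            combos = [(a + b, f"({ea}+{eb})"),
                      (a - b, f"({ea}-{eb})"),
                      (a * b, f"({ea}*{eb})")]
            if abs(b) > eps:  # avoid division by (near-)zero
                combos.append((a / b, f"({ea}/{eb})"))
            # Combine two numbers and recurse on the reduced list; this
            # covers every possible parenthesization.
            for value, expr in combos:
                result = solve_24(rest + [(value, expr)], target, eps)
                if result is not None:
                    return result
    return None


print(solve_24([(n, str(n)) for n in (4, 7, 8, 8)]))  # e.g. ((7-(8/8))*4)
```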

Efficiency and Resource Management

The meta-buffer allows BoT to avoid the redundant computation typically associated with the recursive, heuristic search strategies of multi-query methods. By storing distilled thought-templates, LLMs can swiftly adapt pre-existing solutions to new problems, improving reasoning efficiency.
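
A back-of-envelope comparison of LLM call counts shows where the savings come from. All numbers below are illustrative assumptions, not measurements from the paper (which reports an average end-to-end cost of about 12% of multi-query methods, so this toy ratio overstates the gap).

```python
# Hypothetical per-problem LLM call counts; parameters are assumptions.
def tot_calls(branching: int = 3, depth: int = 4, evals_per_node: int = 1) -> int:
    """Tree-of-Thoughts: each tree node costs a generation call plus scoring."""
    nodes = sum(branching ** d for d in range(1, depth + 1))  # 3+9+27+81 = 120
    return nodes * (1 + evals_per_node)                       # 240 calls


def bot_calls() -> int:
    """BoT: one distillation call plus one instantiated-reasoning call;
    retrieval is a lookup, and buffer updates are amortized across tasks."""
    return 2


print(f"{bot_calls()} vs {tot_calls()} calls per problem")
```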

Model Robustness

The framework emulates human reasoning by retrieving and instantiating relevant thought-templates for new tasks, which leads to consistent problem-solving strategies across similar tasks. This consistent application of high-level guidelines contributes to the robustness of BoT.
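
Reusing the `MetaBuffer` and `ThoughtTemplate` definitions from the pipeline sketch above, the snippet below shows the behavior this robustness claim rests on: two differently worded instances of the same task family retrieve the same thought-template, so the model follows one consistent strategy rather than improvising anew per query (illustrative only).

```python
# Reuses ThoughtTemplate/MetaBuffer from the earlier pipeline sketch.
buffer = MetaBuffer(templates=[ThoughtTemplate(
    description="arithmetic puzzle: combine given numbers with operators to reach a target",
    structure="Enumerate operator and grouping combinations; verify the result equals the target.",
)])

queries = [
    "Combine the numbers 4, 7, 8, 8 with arithmetic operators to reach 24.",
    "Using 3, 3, 8, 8 and the four basic operators, reach the target value 24.",
]
for q in queries:
    t = buffer.retrieve(q)
    # Both wordings map to the same template, hence the same strategy.
    print(t.structure if t else "no template; solve from scratch")
```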

Future Directions

Despite these advantages, certain areas warrant further research. For instance, the current approach may require refinement to adequately handle tasks that demand human-like creativity. Additionally, optimizing the distillation process to raise the quality of thought-templates could further improve performance on complex problems.

There is potential for integrating external resources to build a more comprehensive open-domain reasoning system akin to agent models. Another prospective direction is making thought-template distillation optimizable, which could significantly enhance template quality and reasoning comprehensiveness.

Conclusion

Buffer of Thoughts represents a substantial advance in thought-augmented reasoning for LLMs, providing a systematic way to leverage past problem-solving experience when tackling future tasks. Through its framework and validated efficacy, BoT stands as a promising development for improving reasoning accuracy, efficiency, and robustness in large-scale AI models, and it opens new avenues for continued research into AI reasoning capabilities.

Authors (8)
  1. Ling Yang (88 papers)
  2. Zhaochen Yu (7 papers)
  3. Tianjun Zhang (38 papers)
  4. Shiyi Cao (15 papers)
  5. Minkai Xu (40 papers)
  6. Wentao Zhang (261 papers)
  7. Joseph E. Gonzalez (167 papers)
  8. Bin Cui (165 papers)
Citations (13)