Recursion of Thought: A Divide-and-Conquer Approach to Multi-Context Reasoning with Language Models (2306.06891v1)

Published 12 Jun 2023 in cs.CL and cs.AI

Abstract: Generating intermediate steps, or Chain of Thought (CoT), is an effective way to significantly improve language models' (LM) multi-step reasoning capability. However, the CoT lengths can grow rapidly with the problem complexity, easily exceeding the maximum context size. Instead of increasing the context limit, which has already been heavily investigated, we explore an orthogonal direction: making LMs divide a problem into multiple contexts. We propose a new inference framework, called Recursion of Thought (RoT), which introduces several special tokens that the models can output to trigger context-related operations. Extensive experiments with multiple architectures including GPT-3 show that RoT dramatically improves LMs' inference capability to solve problems whose solutions consist of hundreds of thousands of tokens.

Authors (2)
  1. Soochan Lee (7 papers)
  2. Gunhee Kim (74 papers)
Citations (21)

Summary

Recursion of Thought: A Divide-and-Conquer Approach to Multi-Context Reasoning with Language Models

The paper, "Recursion of Thought: A Divide-and-Conquer Approach to Multi-Context Reasoning with Language Models," authored by Soochan Lee and Gunhee Kim of Seoul National University and the SNU-LG AI Research Center, proposes a way to improve multi-step reasoning by letting an LLM divide its work across multiple contexts rather than relying on a single, ever-growing one. The work frames reasoning as recursive problem solving: a problem is decomposed into sub-problems, each solved in its own context, and the sub-answers are combined to solve the original task.

Methodology

The authors introduce Recursion of Thought (RoT), an inference framework in which the model divides a complex reasoning problem into smaller, manageable sub-problems and solves each one in its own context. The model emits special tokens that trigger context-related operations, such as opening a fresh context for a sub-problem and returning that sub-problem's answer to the parent context. The approach is inspired by traditional divide-and-conquer algorithms but adapted to sequence generation: instead of producing one monolithic chain of thought that can exceed the maximum context size, the model distributes its reasoning across many short contexts. A minimal sketch of this control flow is given below.
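The following Python sketch shows one way such a recursive inference loop could be organized. It is an illustrative skeleton under assumed conventions, not the paper's implementation: generate stands in for a single LM decoding call, and the markers [SUB], [/SUB], and [ANS] are hypothetical placeholders for the paper's special control tokens.

```python
from typing import Callable

def solve(problem: str, generate: Callable[[str], str], max_depth: int = 16) -> str:
    """Solve `problem` in its own context, recursing whenever the model
    requests a sub-problem instead of emitting a final answer.

    Illustrative sketch: `generate` is any callable mapping a context string
    to the model's next utterance; the token names are placeholders.
    """
    if max_depth == 0:
        raise RecursionError("maximum recursion depth reached")
    context = problem
    while True:
        step = generate(context)  # one model "utterance" for this context
        if step.startswith("[ANS]"):
            # The model marked the rest of its output as the final answer
            # for this context; return it to the caller (the parent context).
            return step.removeprefix("[ANS]").strip()
        if step.startswith("[SUB]"):
            # The model asked for a sub-problem to be solved in a fresh
            # context. Only the short answer is spliced back here, so the
            # parent context stays small however deep the recursion goes.
            sub = step.removeprefix("[SUB]").removesuffix("[/SUB]").strip()
            answer = solve(sub, generate, max_depth - 1)
            context += f"[SUB]{sub}[/SUB] {answer}"
        else:
            # Ordinary intermediate reasoning: append it and keep decoding.
            context += " " + step
```

The essential design point is that a child problem is solved in a completely separate context and only its short answer flows back, so the parent never has to hold the child's full reasoning trace.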

Significant Findings

  1. Recursive Architecture Implementation: The paper formulates reasoning as a recursive procedure that the model itself controls, interleaving ordinary intermediate steps with operations that open a new context for a sub-problem and return its answer to the parent. Because each context holds only one sub-problem and the answers of its immediate children, the complete reasoning trace can far exceed what any single context could contain, which improves accuracy on long reasoning tasks.
  2. Performance Metrics: Experiments across multiple architectures, including GPT-3, show substantial improvements over models that must fit an entire chain of thought into one context. RoT-equipped models solve problems whose full solutions span hundreds of thousands of tokens, well beyond a single context window.
  3. Scalability and Adaptability: The authors address scalability by showing that per-context length remains roughly bounded as problem complexity grows: deeper recursion is paid for in the number of contexts rather than in the length of any one of them, keeping the method within a fixed context limit. A toy illustration of this accounting appears after this list.
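As a concrete, entirely assumed illustration of this bounded-context behavior, the snippet below reuses the solve skeleton from the Methodology section with a hand-written policy standing in for a trained model. The policy decomposes "sum a list of numbers" into two half-sized sub-lists, so each individual context only ever holds the problem statement, at most two sub-problem/answer pairs, and the final answer, no matter how long the input list grows.

```python
def toy_policy(context: str) -> str:
    """Deterministic stand-in for an RoT-style model (illustrative only).

    A context looks like "sum: x1 x2 ... xk", optionally followed by solved
    sub-problems of the form "[SUB]...[/SUB] answer".
    """
    parts = context.split("[SUB]")
    numbers = [int(t) for t in parts[0].removeprefix("sum:").split()]
    answers = [int(p.rsplit(None, 1)[-1]) for p in parts[1:]]
    if len(numbers) == 1:
        return f"[ANS] {numbers[0]}"          # base case: a single number
    mid = len(numbers) // 2
    if len(answers) == 0:                      # first, ask for the left half
        return "[SUB]sum: " + " ".join(map(str, numbers[:mid])) + "[/SUB]"
    if len(answers) == 1:                      # then the right half
        return "[SUB]sum: " + " ".join(map(str, numbers[mid:])) + "[/SUB]"
    return f"[ANS] {answers[0] + answers[1]}"  # combine the two sub-answers

problem = "sum: " + " ".join(str(i) for i in range(1, 9))
print(solve(problem, toy_policy))  # -> 36
```

Doubling the list length adds one more level of recursion (more contexts in total) but leaves the size of each individual context essentially unchanged, which is the property RoT relies on to keep arbitrarily long reasoning within a fixed context window.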

Implications

The research makes a compelling case for recursive, multi-context strategies in NLP, particularly for tasks whose reasoning traces are too long for a single context. The proposed framework opens avenues for more capable applications by changing how LLMs can structure complex reasoning tasks.

Speculation on Future Developments

Future advancements in AI could iterate on the recursive methodologies proposed, potentially leading to more autonomous reasoning systems capable of tackling unscripted and complex problem domains across diverse applications. This paper serves as a foundational step, encouraging further exploration into recursive architectures, ultimately aiming for higher cognitive capabilities within AI systems.

In conclusion, "Recursion of Thought: A Divide-and-Conquer Approach to Multi-Context Reasoning with Language Models" contributes meaningfully to the ongoing discussion of recursive methods in NLP. By demonstrating practical gains in multi-context reasoning through divide-and-conquer principles, the work sets a precedent for future studies that aim to extend the reasoning capabilities of AI systems.
