

Large language models (LLMs) have proven effective under different prompting methods, such as Chain of Thought and Program of Thought, and we find that these methods complement one another well on math reasoning tasks. In this work, we propose XoT, an integrated problem-solving framework that prompts LLMs with diverse reasoning thoughts. For each question, XoT begins by selecting the most suitable method and then executes each method iteratively. Within each iteration, XoT actively checks the validity of the generated answer and incorporates feedback from external executors, allowing it to switch dynamically among different prompting methods. Through extensive experiments on 10 popular math reasoning datasets, we demonstrate the effectiveness of the proposed approach and thoroughly analyze the strengths of each module. Moreover, empirical results suggest that our framework is orthogonal to recent work that improves single reasoning methods, and that it can further generalise to the logical reasoning domain. By allowing method switching, XoT provides a fresh perspective on the collaborative integration of diverse reasoning thoughts in a unified framework. The code is available at https://github.com/tengxiaoliu/XoT.
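The select, execute, verify, and switch loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation (see the repository linked above for that). Every name here (`solve_with_switching`, `toy_cot`, `toy_pot`, `toy_plan`, `toy_verify`) is hypothetical, and the LLM calls and external executor are replaced by hard-coded toy functions so the control flow can be run on its own.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical types: a method maps a question to a candidate answer
# (None on failure); a verifier accepts or rejects that candidate.
Method = Callable[[str], Optional[float]]
Verifier = Callable[[str, Optional[float]], bool]
Planner = Callable[[str, List[str]], str]


def solve_with_switching(
    question: str,
    methods: Dict[str, Method],
    plan: Planner,
    verify: Verifier,
) -> Tuple[Optional[float], List[str]]:
    """Try one method at a time, switching whenever verification fails."""
    remaining = list(methods)
    tried: List[str] = []
    answer: Optional[float] = None
    while remaining:
        name = plan(question, remaining)  # pick the most suitable remaining method
        remaining.remove(name)
        tried.append(name)
        answer = methods[name](question)  # execute the chosen method
        if verify(question, answer):      # accept only a verified answer
            return answer, tried
    return answer, tried                  # nothing passed: fall back to the last attempt


# Toy stand-ins (assumptions, not real LLM prompts or executors).
def toy_cot(question: str) -> Optional[float]:
    return 41.0  # simulates a slip in free-text reasoning


def toy_pot(question: str) -> Optional[float]:
    return float(6 * 7)  # simulates running a generated program


def toy_plan(question: str, remaining: List[str]) -> str:
    return remaining[0]  # trivial planner: take the first method still available


def toy_verify(question: str, answer: Optional[float]) -> bool:
    # Substitute the candidate back into the problem: answer / 7 should be 6.
    return answer is not None and answer / 7 == 6


answer, tried = solve_with_switching(
    "What is 6 times 7?",
    {"cot": toy_cot, "pot": toy_pot},
    toy_plan,
    toy_verify,
)
print(answer, tried)  # prints: 42.0 ['cot', 'pot']
```

In this toy run the first method fails verification, so the loop switches to the second one, which passes. A real system would replace the planner, methods, and verifier with prompted LLM calls and external execution feedback.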


