
LPML: LLM-Prompting Markup Language for Mathematical Reasoning (2309.13078v2)

Published 21 Sep 2023 in cs.AI, cs.LG, and cs.PL

Abstract: In utilizing LLMs for mathematical reasoning, a crucial challenge is addressing the reasoning and calculation errors present in the generated text. In this paper, we propose a novel framework that integrates the Chain-of-Thought (CoT) method with an external tool (Python REPL). We found that by prompting LLMs to generate structured text in an XML-like markup language, we could seamlessly integrate CoT with the external tool and control undesired behaviors of the LLMs. With our approach, LLMs can use Python computation to rectify errors within the CoT. We applied our method to ChatGPT (GPT-3.5) on challenging mathematical problems and demonstrated that combining CoT and a Python REPL through the markup language enhances the reasoning capability of LLMs. Our approach enables LLMs to write the markup language and perform advanced mathematical reasoning using only zero-shot prompting.
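The control loop the abstract describes can be illustrated with a minimal sketch. This is a reconstruction from the abstract, not the authors' released code: the specific tag names (<THINK>, <PYTHON>, <OUTPUT>, <ANSWER>), the prompt wording, and the `llm_complete` completion function are assumptions made for illustration.

```python
import re
import io
import contextlib

# Zero-shot prompt instructing the model to reason in XML-like markup.
# Tag names here are illustrative, inferred from the abstract.
SYSTEM_PROMPT = """Solve the problem using only these tags:
<THINK>...</THINK>     step-by-step reasoning (Chain-of-Thought)
<PYTHON>...</PYTHON>   code to be executed by an external Python REPL
<OUTPUT>...</OUTPUT>   will be filled in with the REPL's result
<ANSWER>...</ANSWER>   the final answer
Stop after each </PYTHON> tag and wait for the <OUTPUT>."""


def run_python_block(code: str) -> str:
    """Execute a <PYTHON> block and capture its stdout (stand-in for the REPL)."""
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            exec(code, {})
    except Exception as e:
        return f"Error: {e}"
    return buf.getvalue().strip()


def solve(problem: str, llm_complete) -> str:
    """Alternate between LLM generation and REPL execution until <ANSWER>.

    `llm_complete(prompt) -> str` is a hypothetical helper wrapping
    ChatGPT (GPT-3.5) or a similar chat-completion API.
    """
    transcript = SYSTEM_PROMPT + "\n\nProblem: " + problem + "\n"
    for _ in range(10):  # cap the number of reasoning/execution rounds
        completion = llm_complete(transcript)
        transcript += completion
        answer = re.search(r"<ANSWER>(.*?)</ANSWER>", transcript, re.S)
        if answer:
            return answer.group(1).strip()
        blocks = re.findall(r"<PYTHON>(.*?)</PYTHON>", completion, re.S)
        if blocks:
            # Feed the REPL result back so the model can correct its CoT.
            result = run_python_block(blocks[-1])
            transcript += f"\n<OUTPUT>{result}</OUTPUT>\n"
    return transcript  # no answer produced within the round limit
```

The point this sketch illustrates is that the XML-like tags give the controller unambiguous boundaries between reasoning and executable code, which is what allows REPL output to be spliced back into the chain of thought so the model can revise earlier calculation errors.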

Authors (4)
  1. Ryutaro Yamauchi (3 papers)
  2. Sho Sonoda (26 papers)
  3. Akiyoshi Sannai (27 papers)
  4. Wataru Kumagai (21 papers)
Citations (12)
