
MathChat: Converse to Tackle Challenging Math Problems with LLM Agents (2306.01337v3)

Published 2 Jun 2023 in cs.CL and stat.ML

Abstract: Employing LLMs to address mathematical problems is an intriguing research endeavor, considering the abundance of math problems expressed in natural language across numerous science and engineering fields. LLMs, with their generalized ability, are used as a foundation model to build AI agents for different tasks. In this paper, we study the effectiveness of utilizing LLM agents to solve math problems through conversations. We propose MathChat, a conversational problem-solving framework designed for math problems. MathChat consists of an LLM agent and a user proxy agent which is responsible for tool execution and additional guidance. This synergy facilitates a collaborative problem-solving process, where the agents engage in a dialogue to solve the problems. We perform evaluation on difficult high school competition problems from the MATH dataset. Utilizing Python, we show that MathChat can further improve previous tool-using prompting methods by 6%.
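The two-agent conversation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the LLM agent is replaced by a hypothetical scripted stand-in (`scripted_assistant`), the tool is a plain unsandboxed `exec`, and the `\boxed{}` stopping convention and all function names are assumptions made for illustration.

```python
import contextlib
import io
import re

FENCE = "`" * 3  # built programmatically so this example can sit inside a fence
CODE_BLOCK = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)
BOXED = re.compile(r"\\boxed\{([^{}]*)\}")


def run_python(code: str) -> str:
    """Tool execution: run code and capture stdout (no sandboxing; illustration only)."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, {})
    except Exception as exc:
        return f"Error: {type(exc).__name__}: {exc}"
    return buffer.getvalue().strip()


def user_proxy_reply(assistant_message: str) -> str:
    """User proxy agent: execute any Python block, otherwise give guidance."""
    match = CODE_BLOCK.search(assistant_message)
    if match:
        return run_python(match.group(1))
    return "Please continue; put the final answer in \\boxed{}."


def scripted_assistant(history):
    """Hypothetical stand-in for the LLM agent; a real system would call a model here."""
    assistant_turns = sum(1 for role, _ in history if role == "assistant")
    if assistant_turns == 0:
        code = "print(sum(range(1, 101)))\n"
        return "Let me compute this.\n" + FENCE + "python\n" + code + FENCE
    last_output = history[-1][1]  # the user proxy's execution result
    return f"The sum is \\boxed{{{last_output}}}."


def solve(problem, assistant, max_turns=5):
    """Alternate assistant and user-proxy turns until a boxed answer appears."""
    history = [("user", problem)]
    for _ in range(max_turns):
        reply = assistant(history)
        history.append(("assistant", reply))
        boxed = BOXED.search(reply)
        if boxed:
            return boxed.group(1)
        history.append(("user", user_proxy_reply(reply)))
    return None  # no final answer within the turn budget


answer = solve("What is 1 + 2 + ... + 100?", scripted_assistant)
print(answer)
```

The loop keeps the two roles separate: the assistant only proposes reasoning and code, while the user proxy owns execution and feeds results or errors back into the dialogue.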

Authors (10)
  1. Yiran Wu
  2. Feiran Jia
  3. Shaokun Zhang
  4. Hangyu Li
  5. Erkang Zhu
  6. Yue Wang
  7. Yin Tat Lee
  8. Richard Peng
  9. Qingyun Wu
  10. Chi Wang
Citations (41)