
Code Soliloquies for Accurate Calculations in Large Language Models (2309.12161v2)

Published 21 Sep 2023 in cs.CL

Abstract: High-quality conversational datasets are crucial for the successful development of Intelligent Tutoring Systems (ITS) that use an LLM backend. Synthetic student-teacher dialogues, generated with advanced GPT-4 models, are a common strategy for creating these datasets. However, subjects like physics that entail complex calculations pose a challenge: while GPT-4 has impressive language processing capabilities, its limitations in fundamental mathematical reasoning curtail its efficacy for such subjects. To tackle this limitation, this paper introduces an innovative stateful prompt design. Our design orchestrates a mock conversation in which both the student and tutorbot roles are simulated by GPT-4. Each student response triggers an internal monologue, or "code soliloquy," in the GPT-tutorbot, which assesses whether its subsequent response will necessitate calculations. If a calculation is deemed necessary, it scripts the relevant Python code and uses the Python output to construct its response to the student. Our approach notably enhances the quality of synthetic conversation datasets, especially for calculation-intensive subjects. Preliminary Subject Matter Expert evaluations reveal that our Higgs model, a fine-tuned LLaMA model, effectively uses Python for computations, which significantly enhances the accuracy and computational reliability of Higgs' responses. Code, models, and datasets are available at https://github.com/luffycodes/Tutorbot-Spock-Phys.
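The soliloquy loop the abstract describes (decide whether a calculation is needed, script Python if so, fold the output into the reply) can be sketched as follows. This is a minimal illustration, not the paper's actual prompt design: the two `llm_*` functions are hypothetical stubs standing in for GPT-4 calls, and the hard-coded kinematics formula is an assumed example.

```python
# Minimal sketch of a "code soliloquy" turn. The llm_* stubs stand in
# for GPT-4 calls; all names and prompts here are illustrative
# assumptions, not the paper's actual implementation.

def llm_decide_calculation(student_msg: str) -> bool:
    """Soliloquy step 1 (stubbed): would replying require a calculation?"""
    return any(ch.isdigit() for ch in student_msg)

def llm_write_python(student_msg: str) -> str:
    """Soliloquy step 2 (stubbed): script Python for the needed calculation.
    A real system would have GPT-4 emit this code; here we hard-code a
    v = u + a*t kinematics example for illustration."""
    return "result = 5.0 + 9.8 * 3.0"

def run_code(code: str) -> float:
    """Execute generated code in a fresh namespace and return `result`."""
    ns: dict = {}
    exec(code, ns)  # NOTE: sandbox this in any real deployment
    return ns["result"]

def tutorbot_reply(student_msg: str) -> str:
    """One stateful turn: soliloquy -> optional Python -> student-facing reply."""
    if llm_decide_calculation(student_msg):
        value = run_code(llm_write_python(student_msg))
        return f"Using Python, the final velocity is {value:.1f} m/s."
    return "Let's reason about the concept first; no calculation is needed."

print(tutorbot_reply("A ball starts at 5 m/s and accelerates for 3 s."))
```

The key design point mirrored here is that the calculation decision and the code generation are internal steps the student never sees; only the final natural-language reply, grounded in the Python output, is surfaced.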

Authors (6)
  1. Shashank Sonkar (21 papers)
  2. MyCo Le (2 papers)
  3. Xinghe Chen (6 papers)
  4. Naiming Liu (22 papers)
  5. Debshila Basu Mallick (3 papers)
  6. Richard G. Baraniuk (141 papers)
Citations (11)