MathVC: An LLM-Simulated Multi-Character Virtual Classroom for Mathematics Education (2404.06711v1)
Abstract: Mathematical modeling (MM) is considered a fundamental skill for students in STEM disciplines. Practicing MM is often most effective when students can engage in group discussion and collaborative problem-solving. However, because the teachers and educational resources needed to monitor such group activities are unevenly distributed, students do not always receive equal opportunities for this practice. LLMs have recently demonstrated strong capability in both modeling mathematical problems and simulating characters with different traits and properties. Drawing inspiration from these advances, we present MATHVC, the very first LLM-powered virtual classroom containing multiple LLM-simulated student characters, with whom a human student can practice their MM skills. To encourage each LLM character's behaviors to align with its specified math-relevant properties (termed "characteristics alignment") and the overall conversational procedure to stay close to an authentic student MM discussion (termed "conversational procedural alignment"), we propose three innovations: integrating MM domain knowledge into the simulation, defining a symbolic schema as the ground for character simulation, and designing a meta planner at the platform level to drive the conversational procedure. Through experiments and ablation studies, we confirm the effectiveness of our simulation approach and show the promise of MATHVC to benefit real-life students in the future.