Three Questions Concerning the Use of Large Language Models to Facilitate Mathematics Learning (2310.13615v1)
Published 20 Oct 2023 in cs.CL
Abstract: Due to the remarkable language understanding and generation abilities of LLMs, their use in educational applications has been explored. However, little work has investigated the pedagogical ability of LLMs in helping students learn mathematics. In this position paper, we discuss the challenges of employing LLMs to enhance students' mathematical problem-solving skills by providing adaptive feedback. Beyond generating incorrect reasoning processes, LLMs can misinterpret the meaning of a question and have difficulty understanding a given question's rationale when attempting to correct students' answers. Three research questions are formulated.