Using Large Language Model to Solve and Explain Physics Word Problems Approaching Human Level (2309.08182v2)
Abstract: Our work demonstrates that an LLM pre-trained on text can solve not only pure math word problems but also physics word problems, whose solutions require calculation and inference grounded in prior physical knowledge. We collect and annotate the first physics word problem dataset, PhysQA, which contains over 1000 junior high school physics word problems covering kinematics, mass & density, mechanics, heat, and electricity. We then use OpenAI's GPT-3.5 to generate answers to these problems and find that it automatically solves 49.3% of them with zero-shot prompting and 73.2% with few-shot prompting. This result demonstrates that, by using similar problems and their answers as prompts, an LLM can solve elementary physics word problems at near-human-level performance. Beyond solving problems, GPT-3.5 can also summarize the knowledge or topics a problem covers, provide relevant explanations, and generate new physics word problems based on the input. Ours is the first work to focus on the automatic solving, explanation, and generation of physics word problems across diverse types and scenarios, and it achieves acceptable, state-of-the-art accuracy. This underscores the potential of LLMs for further applications in secondary education.
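The few-shot setup the abstract describes — prepending similar solved problems to the query — can be sketched as below. The prompt-construction helper, the sample problems, and the message format are illustrative assumptions for the OpenAI chat API, not the authors' actual pipeline:

```python
# Sketch of few-shot prompting for physics word problems.
# build_few_shot_prompt and the example problems are hypothetical
# illustrations; the paper's exact prompt format is not reproduced here.

def build_few_shot_prompt(examples, question):
    """Assemble a chat-style message list: each (problem, solution)
    pair becomes one worked example, followed by the new question."""
    messages = [{
        "role": "system",
        "content": "You solve junior high school physics word problems step by step.",
    }]
    for problem, solution in examples:
        messages.append({"role": "user", "content": problem})
        messages.append({"role": "assistant", "content": solution})
    messages.append({"role": "user", "content": question})
    return messages

# One retrieved similar problem with its annotated solution (illustrative).
examples = [
    ("A car travels 120 m in 10 s. What is its average speed?",
     "Average speed v = s / t = 120 m / 10 s = 12 m/s."),
]
question = "A cyclist covers 90 m in 15 s. What is the cyclist's average speed?"
messages = build_few_shot_prompt(examples, question)

# Sending the prompt requires an API key; the call is shown for reference only:
# from openai import OpenAI
# client = OpenAI()
# reply = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages)
```

In the zero-shot condition the `examples` list would simply be empty, so the model sees only the system instruction and the new question.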
- Jingzhe Ding
- Yan Cen
- Xinyuan Wei