
Evaluation of ChatGPT and Microsoft Bing AI Chat Performances on Physics Exams of Vietnamese National High School Graduation Examination (2306.04538v3)

Published 7 Jun 2023 in physics.ed-ph

Abstract: This study assesses the promise and difficulties of LLM-based approaches for physics teaching. It evaluates how well ChatGPT and BingChat, two state-of-the-art (SOTA) LLMs, answer high school physics questions from Vietnamese national exams administered between 2019 and 2023. Comparing the LLMs' results with the scores of Vietnamese students shows that both ChatGPT and BingChat perform worse than the students, indicating that LLMs cannot yet fully replace human expertise in physics teaching. The results also show that neither LLM can reliably answer questions at the higher application levels. In terms of accuracy, BingChat typically surpassed ChatGPT, although ChatGPT showed more stability. Our findings suggest that LLMs can support students and teachers during learning and teaching activities, particularly by offering immediate feedback and individualized learning experiences.
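
As a concrete illustration of the comparison described in the abstract, the sketch below scores multiple-choice answers from two models against an exam answer key and reports per-model accuracy. The question IDs, answer letters, and the `accuracy` helper are all hypothetical, chosen only to illustrate the scoring step; they are not taken from the paper's dataset or code.

```python
def accuracy(model_answers: dict[str, str], answer_key: dict[str, str]) -> float:
    """Fraction of keyed questions the model answered with the correct choice (A-D)."""
    correct = sum(
        1 for qid, key in answer_key.items()
        if model_answers.get(qid, "").strip().upper() == key
    )
    return correct / len(answer_key)

# Illustrative answer key and model outputs for a four-question exam (not real data).
answer_key = {"Q1": "A", "Q2": "C", "Q3": "B", "Q4": "D"}
chatgpt_answers = {"Q1": "A", "Q2": "C", "Q3": "D", "Q4": "D"}
bingchat_answers = {"Q1": "A", "Q2": "B", "Q3": "B", "Q4": "D"}

for name, answers in [("ChatGPT", chatgpt_answers), ("BingChat", bingchat_answers)]:
    print(f"{name} accuracy: {accuracy(answers, answer_key):.0%}")
```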
