Challenges and Opportunities of Moderating Usage of Large Language Models in Education (2312.14969v1)

Published 19 Dec 2023 in cs.HC

Abstract: The increased presence of LLMs in educational settings has ignited debates concerning negative repercussions, including overreliance and inadequate task reflection. Our work advocates moderated usage of such models, designed in a way that supports students and encourages critical thinking. We developed two moderated interaction methods with ChatGPT: hint-based assistance and presenting multiple answer choices. In a study with students (N=40) answering physics questions, we compared the effects of our moderated models against two baseline settings: unmoderated ChatGPT access and internet searches. We analyzed the interaction strategies and found that the moderated versions exhibited less unreflected usage (e.g., copy & paste) compared to the unmoderated condition. However, neither ChatGPT-supported condition could match the ratio of reflected usage present in internet searches. Our research highlights the potential benefits of moderating LLMs, showing a research direction toward designing effective AI-supported educational strategies.
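The abstract does not describe how the two moderated conditions were implemented. The sketch below is one possible way to realize "hint-based assistance" and "multiple answer choices" as a thin prompting layer over a chat API; the prompts, the function names (ask_for_hint, ask_for_choices), the model name, and the use of the OpenAI Python client are illustrative assumptions, not the authors' code.

```python
# Minimal sketch (assumed implementation, not from the paper): two moderated
# ways of querying a chat model, mirroring the study's hint-based and
# multiple-answer-choice conditions.
from openai import OpenAI

client = OpenAI()          # assumes OPENAI_API_KEY is set in the environment
MODEL = "gpt-3.5-turbo"    # stand-in for the ChatGPT model used in the study


def ask_for_hint(question: str) -> str:
    """Return a short hint that guides the student without giving the answer."""
    messages = [
        {"role": "system",
         "content": "You are a physics tutor. Give one short hint that helps "
                    "the student take the next step. Never state the final answer."},
        {"role": "user", "content": question},
    ]
    response = client.chat.completions.create(model=MODEL, messages=messages)
    return response.choices[0].message.content


def ask_for_choices(question: str, n: int = 3) -> str:
    """Return several candidate answers the student must evaluate themselves."""
    messages = [
        {"role": "system",
         "content": f"Propose {n} plausible answers to the question, each with "
                    "a one-sentence justification. Do not reveal which one is correct."},
        {"role": "user", "content": question},
    ]
    response = client.chat.completions.create(model=MODEL, messages=messages)
    return response.choices[0].message.content


if __name__ == "__main__":
    q = "A ball is thrown straight up at 10 m/s. How long until it returns to the thrower's hand?"
    print(ask_for_hint(q))
    print(ask_for_choices(q))
```

Both functions withhold a directly copyable final answer, which is the design intent the abstract attributes to the moderated conditions (reducing unreflected copy-and-paste usage).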

Authors (8)
  1. Lars Krupp (14 papers)
  2. Steffen Steinert (10 papers)
  3. Maximilian Kiefer-Emmanouilidis (23 papers)
  4. Karina E. Avila (11 papers)
  5. Paul Lukowicz (90 papers)
  6. Jochen Kuhn (23 papers)
  7. Stefan Küchemann (22 papers)
  8. Jakob Karolus (12 papers)
Citations (5)