Large Language Models in Introductory Programming Education: ChatGPT's Performance and Implications for Assessments (2308.08572v1)

Published 15 Aug 2023 in cs.SE, cs.AI, and cs.HC

Abstract: This paper investigates the performance of the LLMs ChatGPT-3.5 and GPT-4 in solving introductory programming tasks. Based on the performance, implications for didactic scenarios and assessment formats utilizing LLMs are derived. For the analysis, 72 Python tasks for novice programmers were selected from the free site CodingBat. Full task descriptions were used as input to the LLMs, while the generated replies were evaluated using CodingBat's unit tests. In addition, the general availability of textual explanations and program code was analyzed. The results show high scores of 94.4 to 95.8% correct responses and reliable availability of textual explanations and program code, which opens new ways to incorporate LLMs into programming education and assessment.
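
As a rough illustration of the evaluation pipeline the abstract describes (full task description in, generated code out, correctness checked by unit tests), here is a minimal Python sketch. The helper `query_llm`, the local test harness, and the canned reply are hypothetical; the study itself sent task descriptions to ChatGPT-3.5 / GPT-4 and scored the replies with CodingBat's own web-based unit tests. The example uses the real CodingBat warm-up task `sum_double`.

```python
# Hypothetical sketch of the evaluation loop; not the authors' actual tooling.

def query_llm(task_description: str) -> str:
    """Placeholder for the LLM call.

    In the study, the full CodingBat task description was submitted to the
    chat model; here we return a canned reply so the sketch is runnable.
    """
    return (
        "def sum_double(a, b):\n"
        "    return 2 * (a + b) if a == b else a + b\n"
    )


def evaluate_solution(source: str, func_name: str,
                      test_cases: list[tuple[tuple, object]]) -> float:
    """Execute the generated code and return the fraction of passing tests."""
    namespace: dict = {}
    exec(source, namespace)              # define the candidate function
    func = namespace[func_name]
    passed = sum(1 for args, expected in test_cases if func(*args) == expected)
    return passed / len(test_cases)


# CodingBat "sum_double": return the sum of two ints,
# doubled if both ints have the same value.
tests = [((1, 2), 3), ((3, 2), 5), ((2, 2), 8)]
reply = query_llm(
    "sum_double: Given two int values, return their sum. "
    "Unless the two values are the same, then return double their sum."
)
print(evaluate_solution(reply, "sum_double", tests))  # 1.0 if all tests pass
```

In the paper, this pass/fail signal over 72 such tasks yields the reported 94.4% to 95.8% correct-response rates.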

Authors (2)
  1. Natalie Kiesler (17 papers)
  2. Daniel Schiffner (2 papers)
Citations (18)