Toward Semi-Automatic Misconception Discovery Using Code Embeddings (2103.04448v1)

Published 7 Mar 2021 in cs.LG, cs.CY, and cs.SE

Abstract: Understanding students' misconceptions is important for effective teaching and assessment. However, discovering such misconceptions manually can be time-consuming and laborious. Automated misconception discovery can address these challenges by highlighting patterns in student data, which domain experts can then inspect to identify misconceptions. In this work, we present a novel method for the semi-automated discovery of problem-specific misconceptions from students' program code in computing courses, using a state-of-the-art code classification model. We trained the model on a block-based programming dataset and used the learned embedding to cluster incorrect student submissions. We found these clusters correspond to specific misconceptions about the problem and would not have been easily discovered with existing approaches. We also discuss potential applications of our approach and how these misconceptions inform domain-specific insights into students' learning processes.

Authors (6)
  1. Yang Shi (107 papers)
  2. Krupal Shah (1 paper)
  3. Wengran Wang (4 papers)
  4. Samiha Marwan (4 papers)
  5. Poorvaja Penmetsa (1 paper)
  6. Thomas W. Price (2 papers)
Citations (30)