Comparison of Three Programming Error Measures for Explaining Variability in CS1 Grades (2404.05988v1)
Abstract: Programming courses can be challenging for first-year university students, especially those without prior coding experience. Students initially struggle with code syntax, but as more advanced topics are introduced across a semester, the difficulty in learning to program shifts to learning computational thinking (e.g., debugging strategies). This study examined the relationships between students' rate of programming errors and their grades on two exams. Using an online integrated development environment, data were collected from 280 students in a Java programming course. The course had two parts. The first focused on introductory procedural programming and culminated with exam 1, while the second covered more complex topics and object-oriented programming and ended with exam 2. To measure students' programming abilities, 51,095 code snapshots were collected from students while they completed assignments that were autograded based on unit tests. Compiler and runtime errors were extracted from the snapshots, and three measures -- Error Count, Error Quotient, and Repeated Error Density -- were compared to identify which best explained variability in exam grades. Models using Error Quotient outperformed models using the other two measures, in terms of both explained variability in grades and Bayesian Information Criterion (BIC). Compiler errors were significant predictors of exam 1 grades but not exam 2 grades; only runtime errors significantly predicted exam 2 grades. The findings indicate that leveraging Error Quotient with multiple error types (compiler and runtime) may be a better measure of students' introductory programming abilities, though even the best model did not explain most of the observed variability.
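Of the three measures, Error Quotient (Jadud, 2006) scores *consecutive pairs* of compilation events, penalizing sessions where errors persist from one compile to the next. A minimal sketch of that pairwise scoring is below; the event representation (a chronological list of error types, with `None` for a successful compile) is an assumption for illustration, not the paper's actual snapshot format:

```python
def error_quotient(events):
    """Jadud's Error Quotient for one session of compilation events.

    `events` is a chronological list where each element is the error type
    of a failed compilation (e.g., "cannot find symbol"), or None for a
    successful compile. (This input format is an assumption; the study
    extracted errors from autograded code snapshots.)
    """
    pairs = list(zip(events, events[1:]))
    if not pairs:
        return 0.0
    total = 0
    for first, second in pairs:
        if first is not None and second is not None:
            total += 8          # both compiles in the pair ended in error
            if first == second:
                total += 3      # ...and the same error type repeated
    # Normalize by the maximum possible pair score (8 + 3 = 11).
    return total / (11 * len(pairs))


# Example session: a repeated error, a different error, then success.
# Pairs score 11, 8, and 0, so EQ = 19/33 ~= 0.576.
eq = error_quotient(["missing ;", "missing ;", "cannot find symbol", None])
```

Higher values (closer to 1) indicate a student repeatedly recompiling without resolving errors; a session that alternates errors with successful compiles scores low.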
- Valdemar Švábenský
- Maciej Pankiewicz
- Jiayi Zhang
- Elizabeth B. Cloude
- Ryan S. Baker
- Eric Fouh