TAMIGO: Empowering Teaching Assistants using LLM-assisted viva and code assessment in an Advanced Computing Class (2407.16805v1)
Abstract: LLMs have significantly transformed the educational landscape, offering new tools for students, instructors, and teaching assistants. This paper investigates the application of LLMs in assisting teaching assistants (TAs) with viva and code assessments in an advanced computing class on distributed systems at an Indian university. We develop TAMIGO, an LLM-based system for TAs to evaluate programming assignments. For viva assessment, the TAs generated questions using TAMIGO and circulated these questions to the students for answering. The TAs then used TAMIGO to generate feedback on student answers. For code assessment, the TAs selected specific code blocks from student code submissions and fed them to TAMIGO to generate feedback on these code blocks. The TAs used the TAMIGO-generated feedback on student answers and code blocks for further evaluation. We evaluate the quality of LLM-generated viva questions, model answers, feedback on viva answers, and feedback on student code submissions. Our results indicate that LLMs are highly effective at generating viva questions when provided with sufficient context and background information. However, the results for LLM-generated feedback on viva answers were mixed; instances of hallucination occasionally reduced the accuracy of feedback. Despite this, the feedback was consistent, constructive, comprehensive, balanced, and did not overwhelm the TAs. Similarly, for code submissions, the LLM-generated feedback was constructive, comprehensive, and balanced, though there was room for improvement in aligning the feedback with the instructor-provided rubric for code evaluation. Our findings contribute to understanding the benefits and limitations of integrating LLMs into educational settings.
- Anishka IIITD
- Diksha Sethi
- Nipun Gupta
- Shikhar Sharma
- Srishti Jain
- Ujjwal Singhal
- Dhruv Kumar