- The paper demonstrates that adversarial perturbation can lower LLM-generated code correctness by up to 77%.
- It evaluates five LLMs on CS1 and CS2 problems, finding that no model produced correct solutions for CS1 and that performance on CS2 was only moderate.
- A user study shows students detected 67% of perturbations yet struggled to reverse them, supporting the approach's value for academic integrity.
An Assessment of Adversarial Techniques Against LLM-Assisted Cheating in Introductory Programming
In the paper "Impeding LLM-assisted Cheating in Introductory Programming Assignments via Adversarial Perturbation," the authors explore the use of adversarial techniques to combat cheating facilitated by LLM tools such as ChatGPT in educational settings, particularly in introductory programming courses. This paper is prompted by the capabilities of LLMs like GitHub Copilot and ChatGPT, which, although beneficial in professional software development, pose a risk of misuse among students as aids in bypassing learning requirements.
Study Overview
The research investigates the performance of five prevalent LLMs on programming assignments from introductory courses (CS1 and CS2) at the University of Arizona. The selected models, including GPT-3.5, GitHub Copilot, and Code Llama, were evaluated on a combination of short and long programming problems. Notably, all models struggled with the CS1 problems, producing correct solutions for none of the assignments, and fared only modestly better on CS2, where GitHub Copilot performed best with an average correctness of 51.5% on short problems.
The authors propose an approach based on adversarial perturbation to degrade LLM performance on code generation tasks. They designed ten perturbation techniques, divided into core and exploratory tactics, that alter problem statements through minor text changes or the use of Unicode characters to confuse LLMs. These perturbations reduced the average correctness score of generated code by up to 77%, demonstrating their efficacy in impeding LLM-assisted cheating. Core techniques, including character removal and replacing tokens with synonyms or numerics, proved effective, while exploratory techniques such as random replacements exploited the sensitivity of LLMs to input variations.
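To make the flavor of these tactics concrete, the sketch below shows what a homoglyph substitution and a synonym swap on a problem statement might look like in Python. It is a hypothetical illustration, not the authors' implementation; the character map, synonym table, and function names are invented for this example.

```python
# Illustrative sketch of two perturbation tactics in the spirit of the paper:
# (1) swapping Latin letters for visually similar Unicode homoglyphs, and
# (2) replacing cue words with synonyms. Mappings below are hypothetical.

HOMOGLYPHS = {
    "a": "\u0430",  # Cyrillic small a, renders like Latin 'a'
    "e": "\u0435",  # Cyrillic small ie, renders like Latin 'e'
    "o": "\u043e",  # Cyrillic small o, renders like Latin 'o'
}

SYNONYMS = {
    "list": "sequence",
    "returns": "produces",
    "string": "text",
}

def perturb_synonyms(text: str) -> str:
    """Replace common cue words with synonyms to weaken keyword matching."""
    return " ".join(SYNONYMS.get(word.lower(), word) for word in text.split())

def perturb_homoglyphs(text: str) -> str:
    """Swap selected characters for look-alike Unicode code points.

    The statement still reads normally to a student, while the tokens
    an LLM receives differ from the original wording.
    """
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)

if __name__ == "__main__":
    prompt = "Write a function that takes a list and returns the largest element."
    print(perturb_homoglyphs(perturb_synonyms(prompt)))
```

The design intent in both cases is the same: the perturbed statement should remain readable to a diligent student while shifting the surface form enough that a model prompted with it verbatim is more likely to produce incorrect code.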
Practical Implications and User Study
A significant portion of the research involved a user study with participants who had previously completed the relevant courses, aiming to understand how effectively students could detect and reverse perturbed assignments when using ChatGPT. Results indicated that students noticed perturbations in 67% of cases, yet they were less successful at correcting them. Perturbations were also less effective once detected, highlighting a critical balance between subtlety and impact.
Implications and Future Directions
The paper offers a pragmatic solution for educators facing the challenge of LLM-assisted cheating. By applying adversarial perturbation to problem statements, instructors can make effortless code generation harder for LLMs and thereby encourage deeper student engagement with the original problem. This represents an adjustable, proactive measure rather than a passive detection strategy, directly addressing potential academic integrity issues.
The research prompts further exploration into robust educational practices to mitigate technologically assisted cheating while maintaining educational objectives. Future research may involve refining perturbation techniques to optimize the balance between making problems unsolvable for LLMs and keeping them comprehensible for genuine student efforts. Moreover, as LLM capabilities evolve, the strategies against LLM misuse must be continually assessed and adapted.
In summary, this paper offers insightful contributions to the discourse on integrating LLMs into educational environments, proposing adversarial perturbation as a viable technique to address the misuse of AI tools in academic contexts. It emphasizes the need for innovative solutions to maintain academic integrity in the face of rapidly advancing AI technologies.