Impeding LLM-assisted Cheating in Introductory Programming Assignments via Adversarial Perturbation (2410.09318v2)

Published 12 Oct 2024 in cs.CL, cs.CY, and cs.SE

Abstract: While LLM-based programming assistants such as Copilot and ChatGPT can help improve the productivity of professional software developers, they can also facilitate cheating in introductory computer programming courses. Assuming instructors have limited control over the industrial-strength models, this paper investigates the baseline performance of 5 widely used LLMs on a collection of introductory programming problems, examines adversarial perturbations to degrade their performance, and describes the results of a user study aimed at understanding the efficacy of such perturbations in hindering actual code generation for introductory programming assignments. The user study suggests that i) the perturbations, in combination, reduced the average correctness score by 77%, and ii) the drop in correctness caused by these perturbations depended on their detectability.

Summary

  • The paper demonstrates that adversarial perturbation can lower LLM-generated code correctness by up to 77%.
  • It evaluates five LLMs on CS1 and CS2 problems, revealing no correct solutions for CS1 and moderate success for CS2.
  • A user study shows students detected 67% of perturbations yet struggled to reverse them, supporting the technique's value for upholding academic integrity.

An Assessment of Adversarial Techniques Against LLM-Assisted Cheating in Introductory Programming

In the paper "Impeding LLM-assisted Cheating in Introductory Programming Assignments via Adversarial Perturbation," the authors explore the use of adversarial techniques to combat cheating facilitated by LLM tools such as ChatGPT in educational settings, particularly in introductory programming courses. The work is motivated by the capabilities of LLMs like GitHub Copilot and ChatGPT, which, although beneficial in professional software development, can be misused by students to bypass learning objectives.

Study Overview

The research investigates the performance of five prevalent LLMs on programming assignments from introductory courses (CS1 and CS2) at the University of Arizona. The selected models, including GPT-3.5, GitHub Copilot, and Code Llama, were evaluated on a combination of short and long programming problems. Notably, all models struggled with CS1 problems, producing no fully correct solutions, and fared modestly better on CS2 problems, where GitHub Copilot achieved the highest performance, averaging 51.5% correctness on short problems.
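The paper's grading harness is not reproduced in this summary, but the correctness metric can be illustrated with a minimal sketch: run each model-generated solution against an instructor-written test suite and report the fraction of cases passed. The names below (TestCase, run_solution, correctness_score) are hypothetical, as is the assumption that solutions are standalone programs graded on stdin/stdout behavior.

```python
# Minimal sketch of a test-based correctness score; assumes each generated
# solution is a standalone Python program graded on stdin/stdout behavior.
import subprocess
from dataclasses import dataclass

@dataclass
class TestCase:
    stdin: str      # input fed to the generated program
    expected: str   # expected stdout, compared after stripping whitespace

def run_solution(source_file: str, case: TestCase, timeout: float = 5.0) -> bool:
    """Run one generated solution on one test case and compare its output."""
    try:
        proc = subprocess.run(
            ["python", source_file],
            input=case.stdin,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    return proc.returncode == 0 and proc.stdout.strip() == case.expected.strip()

def correctness_score(source_file: str, cases: list[TestCase]) -> float:
    """Fraction of test cases passed, e.g. 0.515 for 51.5% correctness."""
    passed = sum(run_solution(source_file, c) for c in cases)
    return passed / len(cases)
```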

The authors propose an approach focused on adversarial perturbation to degrade LLM performance in code generation tasks. They designed ten perturbation techniques, divided into core and exploratory tactics, that alter problem statements through minor text changes or the insertion of Unicode characters that confuse LLMs. These perturbations reduced the average correctness score of generated code by up to 77%, demonstrating their efficacy in impeding LLM-assisted cheating. Core techniques, including character removal and replacing tokens with synonyms or numerics, were effective, while exploratory techniques such as random replacements exploited the sensitivity of LLMs to input variations.
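The paper's exact implementations are not reproduced here, but two of the described tactics, character removal and Unicode substitution, can be sketched as simple string transformations on a problem statement. The homoglyph map, function names, and keyword list below are illustrative assumptions, not the authors' code.

```python
import random

# Illustrative homoglyph map: Latin letters swapped for visually similar
# Cyrillic code points, which render alike but tokenize very differently.
HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e", "c": "\u0441"}

def perturb_homoglyphs(statement: str, rate: float = 0.3, seed: int = 0) -> str:
    """Swap a fraction of mapped characters for Unicode lookalikes."""
    rng = random.Random(seed)
    return "".join(
        HOMOGLYPHS[ch] if ch in HOMOGLYPHS and rng.random() < rate else ch
        for ch in statement
    )

def perturb_char_removal(statement: str, keywords: list[str]) -> str:
    """Drop one interior character from each listed keyword, e.g.
    'palindrome' -> 'palinrome', keeping the text human-readable."""
    for word in keywords:
        if len(word) > 3 and word in statement:
            i = len(word) // 2
            statement = statement.replace(word, word[:i] + word[i + 1:], 1)
    return statement

prompt = "Write a function that checks whether a string is a palindrome."
print(perturb_homoglyphs(prompt))                    # lookalike substitution
print(perturb_char_removal(prompt, ["palindrome"]))  # '... a palinrome.'
```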

Practical Implications and User Study

A significant portion of the research involved a user study with participants who had previously completed the relevant courses, aiming to understand how effectively students could detect and reverse perturbations when using ChatGPT. The results indicated that students noticed perturbations in 67% of cases, yet were markedly less successful at correcting them. Perturbations also lost some of their effectiveness once detected, highlighting a critical balance between subtlety and impact.

Implications and Future Directions

The paper offers a pragmatic solution for educators facing the challenge of LLM-assisted cheating. By applying adversarial perturbations to problem statements, instructors can complicate effortless LLM code generation and thereby encourage more genuine student engagement with the underlying problem-solving. The technique is an adjustable, proactive measure rather than a passive detection strategy, directly addressing potential academic integrity issues.

The research prompts further exploration into robust educational practices to mitigate technologically assisted cheating while maintaining educational objectives. Future research may involve refining perturbation techniques to optimize the balance between making problems unsolvable for LLMs and keeping them comprehensible for genuine student efforts. Moreover, as LLM capabilities evolve, the strategies against LLM misuse must be continually assessed and adapted.

In summary, this paper offers insightful contributions to the discourse on integrating LLMs into educational environments, proposing adversarial perturbation as a viable technique to address the misuse of AI tools in academic contexts. It emphasizes the need for innovative solutions to maintain academic integrity in the face of rapidly advancing AI technologies.
