Can We Trust the AI Pair Programmer? Copilot for API Misuse Detection and Correction (2509.16795v1)
Abstract: API misuse leads to security vulnerabilities, system failures, and increased maintenance costs, all of which remain critical challenges in software development. Existing detection approaches rely on static analysis or machine learning-based tools that operate post-development, which delays defect resolution. Delayed defect resolution can significantly increase the cost and complexity of maintenance and negatively impact software reliability and user trust. AI-powered code assistants, such as GitHub Copilot, offer the potential for real-time API misuse detection within development environments. This study evaluates GitHub Copilot's effectiveness in identifying and correcting API misuse using MUBench, a curated benchmark of misuse cases. We construct 740 misuse examples, both manually and as AI-assisted variants, from correct usage patterns and misuse specifications. These examples and 147 correct usage cases are analyzed using Copilot integrated into Visual Studio Code. Copilot achieved a detection accuracy of 86.2%, precision of 91.2%, and recall of 92.4%. It performed strongly on common misuse types (e.g., missing-call, null-check) but struggled with compound or context-sensitive cases. Notably, Copilot successfully fixed over 95% of the misuses it identified. These findings highlight both the strengths and limitations of AI-driven coding assistants, positioning Copilot as a promising tool for real-time pair programming and for detecting and fixing API misuse during software development.
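For illustration, the sketch below shows the kind of "missing-call" misuse that MUBench-style benchmarks catalog and that the study asks Copilot to detect and fix. The snippet is a minimal hypothetical example, not taken from the benchmark or the paper: calling `Iterator.next()` without a preceding `hasNext()` guard, alongside the corrected usage an assistant would be expected to suggest.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of a "missing-call" API misuse on java.util.Iterator.
// Not an example from MUBench or the paper; it only mirrors the misuse category.
public class IteratorMisuseExample {

    // Misuse: next() is called without first checking hasNext(),
    // so an empty list throws NoSuchElementException at runtime.
    static String firstElementMisuse(List<String> items) {
        Iterator<String> it = items.iterator();
        return it.next(); // missing-call misuse: no hasNext() guard
    }

    // Correct usage: guard next() with hasNext(); this is the kind of fix
    // an AI assistant such as Copilot would be expected to propose.
    static String firstElementFixed(List<String> items) {
        Iterator<String> it = items.iterator();
        if (it.hasNext()) {
            return it.next();
        }
        return null; // or throw a domain-specific exception
    }

    public static void main(String[] args) {
        System.out.println(firstElementFixed(List.of("a", "b"))); // prints "a"
        System.out.println(firstElementFixed(List.of()));         // prints "null", no crash
    }
}
```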