Investigating Efficacy of Perplexity in Detecting LLM-Generated Code (2412.16525v1)

Published 21 Dec 2024 in cs.SE

Abstract: LLM-generated code (LLMgCode) has become increasingly prevalent in software development. Many studies report that LLMgCode has more quality and security issues than human-authored code (HaCode). LLMgCode is often mixed with HaCode in a code change, yet the change is signed off by human developers alone, without careful review. Many automated methods have been proposed to distinguish LLMgCode from HaCode, among which the perplexity-based method (PERPLEXITY for short) is the state of the art. However, evaluations of PERPLEXITY's efficacy have focused on detection accuracy alone. In this article, we investigate whether PERPLEXITY remains effective across a wider range of realistic evaluation settings. To this end, we devise a large-scale dataset comprising 11,664 HaCode snippets and 13,164 LLMgCode snippets, and on that basis carry out a family of experiments comparing PERPLEXITY against feature-based and pre-training-based methods from three perspectives: (1) detection accuracy by programming language, degree of difficulty, and scale of solution; (2) generalization capability; and (3) inference efficiency. The experimental results show that PERPLEXITY has the best generalization capability but low accuracy and efficiency in most cases. Based on the experimental results and the detection mechanism of PERPLEXITY, we discuss implications for both its strengths and limitations; for example, PERPLEXITY is unsuitable for high-level programming languages but offers good interpretability. As the first large-scale investigation of detecting LLMgCode from HaCode, this article provides a wide range of evidence for future improvement.
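
To make the detection mechanism concrete, below is a minimal sketch of how a perplexity-based detector works: score a snippet with a causal language model and flag it as LLM-generated when its perplexity falls below a threshold, on the intuition that LLM output is highly predictable to a language model. This is an illustrative sketch, not the paper's implementation; the choice of GPT-2 as the scoring model and the threshold value are assumptions for demonstration only.

```python
# Minimal sketch of perplexity-based LLMgCode detection.
# Assumption: GPT-2 stands in for the scoring model; the paper's actual
# models, scoring details, and calibrated threshold are not reproduced here.
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()


def perplexity(code: str) -> float:
    """Return the model's perplexity over a code snippet."""
    inputs = tokenizer(code, return_tensors="pt", truncation=True, max_length=1024)
    with torch.no_grad():
        # Passing input_ids as labels makes the model return the mean
        # next-token cross-entropy loss; perplexity is exp(loss).
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    return math.exp(loss.item())


# Illustrative threshold (not from the paper): lower perplexity means the
# snippet is more predictable, hence more likely LLM-generated.
THRESHOLD = 10.0


def is_llm_generated(code: str) -> bool:
    return perplexity(code) < THRESHOLD


if __name__ == "__main__":
    snippet = "def add(a, b):\n    return a + b\n"
    print(f"perplexity={perplexity(snippet):.2f}, llm={is_llm_generated(snippet)}")
```

One consequence of this design, which the paper's findings reflect, is that detection requires a full forward pass of a language model per snippet (costly at scale), while the per-token probabilities make the decision interpretable: one can inspect exactly which tokens the model found predictable.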
