Evaluating the Energy-Efficiency of the Code Generated by LLMs (2505.20324v1)

Published 23 May 2025 in cs.SE and cs.AI

Abstract: As the quality of code generated by LLMs improves, their adoption in the software industry for automated code generation continues to grow. Researchers primarily focus on enhancing the functional correctness of the generated code while commonly overlooking its energy efficiency and environmental impact. This paper investigates the energy efficiency of the code generated by 20 popular LLMs for 878 programming problems of varying difficulty levels and diverse algorithmic categories selected from the LeetCode platform by comparing them against canonical human-written solutions. Although LLMs can produce functionally correct results in most cases, our findings show that the performance and energy efficiency of LLM-produced solutions are often far below those of human-written solutions. Among the studied LLMs, DeepSeek-v3 and GPT-4o generate the most energy-efficient code, whereas Grok-2 and Gemini-1.5-Pro are among the least energy-efficient models. On average, human-generated canonical solutions are approximately 1.17 times more energy efficient than DeepSeek-v3, 1.21 times more energy efficient than GPT-4o, and over 2 times more energy efficient than Grok-2 and Gemini-1.5-Pro. For specific algorithmic groups such as dynamic programming, backtracking, and bit manipulation, LLM-generated code can consume up to 450 times more energy than human-generated canonical solutions.
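To make the "times more energy efficient" figures concrete, the sketch below computes a relative energy ratio as mean LLM energy divided by mean human-solution energy. The abstract does not say how the paper aggregates its measurements, so the ratio-of-means definition, the function name `energy_ratio`, and the joule values are all assumptions chosen for illustration, not the authors' method or data.

```python
from statistics import mean


def energy_ratio(llm_joules, human_joules):
    """Return mean LLM-solution energy divided by mean human-solution energy.

    A value of 1.17 would mean the LLM solution uses 1.17x the energy of the
    human baseline. This definition is an assumption for this sketch.
    """
    if not llm_joules or not human_joules:
        raise ValueError("both measurement lists must be non-empty")
    return mean(llm_joules) / mean(human_joules)


# Hypothetical energy readings in joules, invented for illustration only.
human_runs = [10.0, 10.0, 10.0]
llm_runs = [11.7, 11.7, 11.7]

ratio = energy_ratio(llm_runs, human_runs)
print(f"LLM solution uses {ratio:.2f}x the energy of the human solution")
```

Under this reading, a ratio above 1 means the human solution is the more energy-efficient one, and a ratio near 450 would correspond to the worst cases the abstract reports for some algorithmic groups.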
