Curriculum Learning for Small Code Language Models (2407.10194v1)

Published 14 Jul 2024 in cs.LG, cs.AI, and cs.PL

Abstract: Code LLMs have emerged as useful tools for various programming tasks, yet they often struggle with complex ones. In this paper, we explore the potential of curriculum learning in enhancing the performance of these models. While prior research has suggested that curriculum learning does not necessarily help in improving the performance of LLMs, our results surprisingly show that this may not be the case for code LLMs. We demonstrate that a well-designed curriculum learning approach significantly improves the accuracy of small decoder-only code LLMs on the task of code execution, while its effect on code completion is smaller. To explore the potential of curriculum learning, we train multiple GPT models with 1 million parameters each to predict the next token and evaluate them on code completion and execution tasks. Our contributions include proposing a novel code difficulty assessment metric that combines software code measures, investigating the effectiveness of curriculum learning for code LLMs, and introducing a novel curriculum learning schedule that enhances the performance of small decoder-only LLMs on code execution tasks. The results of this paper open the door for more research on the use of curriculum learning for code LLMs.
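The abstract describes a difficulty metric built from software code measures and an easy-to-hard training schedule, but does not spell out either one. The following is a minimal illustrative sketch in Python, assuming a toy difficulty score that mixes token count with a branch count and a curriculum that simply sorts training samples from easiest to hardest; the function names, weights, and chosen measures are placeholders for illustration, not the authors' actual metric or schedule.

```python
import ast

def difficulty_score(source: str, w_len: float = 0.5, w_branch: float = 0.5) -> float:
    """Toy difficulty score: weighted mix of token count and branching constructs.

    The two measures and their weights are illustrative assumptions, not the
    paper's actual code-difficulty metric.
    """
    tree = ast.parse(source)
    branches = sum(
        isinstance(node, (ast.If, ast.For, ast.While, ast.Try))
        for node in ast.walk(tree)
    )
    return w_len * len(source.split()) + w_branch * branches

def curriculum_order(samples):
    """Order training samples from easiest to hardest by the score above."""
    return sorted(samples, key=difficulty_score)

if __name__ == "__main__":
    corpus = [
        "for i in range(10):\n    if i % 2 == 0:\n        print(i)",
        "x = 1\ny = x + 2",
    ]
    for snippet in curriculum_order(corpus):
        print(f"{difficulty_score(snippet):.1f}", repr(snippet))
```

In an actual curriculum-learning run, the sorted (or stage-bucketed) samples would be fed to the next-token training loop in that order rather than shuffled uniformly; the paper additionally compares different schedules, which this sketch does not attempt to reproduce.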
