IRCoCo: Immediate Rewards-Guided Deep Reinforcement Learning for Code Completion (2401.16637v3)
Abstract: Code completion aims to enhance programming productivity by predicting potential code based on the current programming context. Recently, pre-trained language models (LMs) have become prominent in this field, and various approaches have been proposed to fine-tune them for code completion using supervised fine-tuning (SFT). However, the inherent exposure bias of these models can cause errors to accumulate early in sequence completion, leading to even more errors in subsequent completions. Deep reinforcement learning (DRL) is an alternative technique for fine-tuning LMs for code completion, one that can improve generalization and overall performance. Nevertheless, integrating DRL-based strategies into code completion faces two major challenges: 1) the dynamic nature of the code context requires the completion model to adapt quickly to changes, which is difficult for conventional DRL strategies that rely on a delayed reward computed from the final code state; 2) the correctness of partial code is hard to evaluate, so reward-redistribution strategies cannot be adapted to code completion. To tackle these challenges, we propose IRCoCo, a DRL-based fine-tuning framework tailored to code completion. The framework provides immediate rewards as feedback for detecting the dynamic context changes that arise from continuous edits during code completion. With this immediate feedback, the fine-tuned LM gains a more precise understanding of the current context, enabling it to adjust effectively and optimize code completion in a more fine-grained manner. Experimental results demonstrate that fine-tuning pre-trained LMs with IRCoCo leads to significant improvements on the code completion task, outperforming both SFT-based and other DRL-based baselines.
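To make the idea concrete, the snippet below sketches how immediate, per-token rewards could drive a policy-gradient (REINFORCE-style) update during fine-tuning, in contrast to conventional DRL pipelines that assign a single delayed reward to the completed sequence. This is a minimal illustration under stated assumptions, not IRCoCo's actual implementation: the GPT-2 backbone and the `immediate_reward` placeholder are hypothetical choices made here for self-containment, whereas the paper's framework supplies per-step reward signals of its own design.

```python
# Minimal sketch of policy-gradient fine-tuning with immediate, per-token
# rewards, in the spirit of (but not identical to) IRCoCo.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)


def immediate_reward(prefix_ids: torch.Tensor, next_id: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in: score the partial program after appending
    next_id. A learned reward model would go here; a random value keeps
    this sketch self-contained and runnable."""
    return torch.rand(())


def reinforce_step(prompt: str, max_new_tokens: int = 20) -> float:
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    generated = input_ids
    log_probs, rewards = [], []
    for _ in range(max_new_tokens):
        logits = model(generated).logits[:, -1, :]        # next-token logits
        dist = torch.distributions.Categorical(logits=logits)
        action = dist.sample()                            # sample next token
        log_probs.append(dist.log_prob(action).squeeze())
        rewards.append(immediate_reward(generated, action))
        generated = torch.cat([generated, action.unsqueeze(0)], dim=-1)
    # REINFORCE-style loss: weight each token's log-probability by the
    # immediate reward it received, rather than by a single delayed reward
    # assigned to the finished sequence.
    loss = -(torch.stack(log_probs) * torch.stack(rewards)).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


print(reinforce_step("def fibonacci(n):"))
```

In practice, the placeholder reward would be replaced by an evaluator of the partial program, and a baseline or critic would typically be subtracted from each reward to reduce gradient variance.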