The Right Prompts for the Job: Repair Code-Review Defects with Large Language Model (2312.17485v1)
Abstract: Automatic program repair (APR) techniques have the potential to reduce the manual effort of uncovering and repairing program defects during the code review (CR) process. However, the limited accuracy and considerable time cost of existing APR approaches hinder their adoption in industrial practice. One key factor is the under-utilization of review comments, which provide valuable insights into defects and potential fixes. Recent advances in large language models (LLMs) have improved their ability to comprehend natural and programming languages, enabling them to generate patches based on review comments. This paper conducts a comprehensive investigation into the effective use of LLMs for repairing CR defects. Various prompts are designed and compared across mainstream LLMs using two distinct datasets, one from human reviewers and one from automated checkers. Experimental results demonstrate a repair rate of 72.97% with the best prompt, a substantial improvement in the effectiveness and practicality of automatic repair techniques.
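To make the idea of repairing from review comments concrete, here is a minimal sketch of how a repair prompt might be assembled from a flagged code snippet and its review comment, and how a repair rate could be computed over a set of outcomes. The names `ReviewDefect`, `build_repair_prompt`, and `repair_rate`, the `source` labels, and the prompt wording are all assumptions made for illustration; the abstract does not specify the paper's actual prompt templates or evaluation procedure.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ReviewDefect:
    """One code-review defect: the flagged code plus the review comment.

    Hypothetical structure for illustration only.
    """
    code: str
    comment: str
    source: str  # assumed label, e.g. "human" or "checker"


def build_repair_prompt(defect: ReviewDefect) -> str:
    """Assemble a plain-text repair prompt.

    The wording is illustrative and is not taken from the paper.
    """
    return (
        "You are fixing a defect found during code review.\n"
        f"Review comment ({defect.source}):\n{defect.comment}\n\n"
        f"Code under review:\n{defect.code}\n\n"
        "Return only the repaired code."
    )


def repair_rate(outcomes: Sequence[bool]) -> float:
    """Percentage of defects whose generated patch was judged correct."""
    if not outcomes:
        return 0.0
    return 100.0 * sum(outcomes) / len(outcomes)


if __name__ == "__main__":
    defect = ReviewDefect(
        code="String name = user.getName().trim();",
        comment="user.getName() may return null; guard before calling trim().",
        source="human",
    )
    print(build_repair_prompt(defect))
    # Toy outcomes, not data from the paper.
    print(repair_rate([True, True, False, True]))
```

The prompt text would then be sent to whichever LLM is under evaluation; that call is deliberately left out because the abstract does not name a specific API.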
Authors: Zelin Zhao, Zhaogui Xu, Jialong Zhu, Peng Di, Yuan Yao, Xiaoxing Ma