
The Right Prompts for the Job: Repair Code-Review Defects with Large Language Model (2312.17485v1)

Published 29 Dec 2023 in cs.SE

Abstract: Automatic program repair (APR) techniques have the potential to reduce manual efforts in uncovering and repairing program defects during the code review (CR) process. However, the limited accuracy and considerable time costs associated with existing APR approaches hinder their adoption in industrial practice. One key factor is the under-utilization of review comments, which provide valuable insights into defects and potential fixes. Recent advancements in LLMs have enhanced their ability to comprehend natural and programming languages, enabling them to generate patches based on review comments. This paper conducts a comprehensive investigation into the effective utilization of LLMs for repairing CR defects. In this study, various prompts are designed and compared across mainstream LLMs using two distinct datasets from human reviewers and automated checkers. Experimental results demonstrate a remarkable repair rate of 72.97% with the best prompt, highlighting a substantial improvement in the effectiveness and practicality of automatic repair techniques.

Authors (6)
  1. Zelin Zhao
  2. Zhaogui Xu
  3. Jialong Zhu
  4. Peng Di
  5. Yuan Yao
  6. Xiaoxing Ma
Citations (2)