
Using Large Language Models for Commit Message Generation: A Preliminary Study (2401.05926v2)

Published 11 Jan 2024 in cs.SE

Abstract: A commit message is a textual description of the code changes in a commit and a key part of the Git version control system (VCS). It captures the essence of a software update and thus helps developers understand code evolution and collaborate efficiently. However, writing good and valuable commit messages is time-consuming and labor-intensive. Researchers have studied the automatic generation of commit messages extensively and proposed several methods for this purpose, such as generation-based and retrieval-based models. However, few studies have explored whether LLMs can generate commit messages automatically and effectively. To this end, this paper designs and conducts a series of experiments to comprehensively evaluate the performance of popular open-source and closed-source LLMs, i.e., Llama 2 and ChatGPT, on commit message generation. The results indicate that, on the BLEU and Rouge-L metrics, LLMs surpass the existing methods on certain indicators but lag behind on others. In human evaluations, however, LLMs show a distinct advantage over all the existing methods: in 78% of the 366 samples, the commit messages generated by LLMs were rated by humans as the best. This work not only reveals the promising potential of using LLMs to generate commit messages, but also exposes the limitations of commonly used metrics in evaluating the quality of auto-generated commit messages.
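The abstract's finding that LLMs score inconsistently on Rouge-L yet win human evaluations is easier to appreciate once you see how mechanical the metric is. Below is a minimal sketch of Rouge-L as an F1 score over the longest common subsequence of tokens, assuming simple whitespace tokenization (the paper's exact tokenization and scoring variant may differ):

```python
def lcs_len(a, b):
    # Classic dynamic-programming longest common subsequence length.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]

def rouge_l(candidate: str, reference: str) -> float:
    # Rouge-L F1: harmonic mean of LCS-based precision and recall
    # over whitespace tokens (a common simplification).
    c, r = candidate.split(), reference.split()
    lcs = lcs_len(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)

# A generated message can be perfectly accurate yet penalized for
# omitting one reference token:
print(rouge_l("fix null pointer in parser",
              "fix null pointer dereference in parser"))  # ≈ 0.909
```

Because the score depends only on token overlap with a single reference message, a fluent and correct LLM-generated message phrased differently from the reference can score poorly, which is consistent with the gap the paper observes between metric-based and human evaluation.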

