
Are your comments outdated? Towards automatically detecting code-comment consistency (2403.00251v1)

Published 1 Mar 2024 in cs.SE

Abstract: In software development and maintenance, code comments help developers understand source code and improve communication among developers. However, developers sometimes neglect to update the corresponding comment when changing the code, resulting in outdated comments (i.e., inconsistent code and comments). Outdated comments are harmful and may mislead subsequent developers. More seriously, they may lead to a fatal flaw in the future. To automatically identify outdated comments in source code, we propose a learning-based method, called CoCC, to detect the consistency between code and comments. To efficiently identify outdated comments, we extract multiple features from both code and comments before and after they change. We also consider the relation between code and comment in our model. Experimental results show that CoCC can effectively detect outdated comments with precision over 90%. In addition, we have identified the 15 most important factors that cause outdated comments and verified the applicability of CoCC in different programming languages. We also used CoCC to find outdated comments in the latest commits of open source projects, which further demonstrates the effectiveness of the proposed method.
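The abstract mentions extracting features from code and comments before and after a change, as well as the relation between the two. As a rough illustration only, the sketch below computes a few hand-picked features for one code-comment change. The abstract does not list CoCC's actual features, so the feature names (`code_change_ratio`, `sim_before`, `sim_after`, `sim_drop`) and the helper functions `tokens`, `jaccard`, and `change_features` are hypothetical, chosen for this example.

```python
import re
from difflib import SequenceMatcher


def tokens(text):
    """Return the set of lowercase word tokens in text.

    Identifiers are split on non-letters (snake_case) and on
    case boundaries (camelCase).
    """
    out = set()
    for word in re.findall(r"[A-Za-z]+", text):
        for part in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word):
            out.add(part.lower())
    return out


def jaccard(a, b):
    """Jaccard similarity of two token sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def change_features(old_code, new_code, old_comment, new_comment):
    """Hypothetical features for one code-comment change (not CoCC's own)."""
    sim_before = jaccard(tokens(old_comment), tokens(old_code))
    sim_after = jaccard(tokens(new_comment), tokens(new_code))
    return {
        # How much the code text changed (0 = identical).
        "code_change_ratio": 1.0 - SequenceMatcher(None, old_code, new_code).ratio(),
        # Whether the developer touched the comment at all.
        "comment_changed": old_comment != new_comment,
        # Code-comment token overlap before and after the change.
        "sim_before": sim_before,
        "sim_after": sim_after,
        # Positive when the comment matches the new code less well than the old.
        "sim_drop": sim_before - sim_after,
    }


if __name__ == "__main__":
    features = change_features(
        "def get_user_name(user): return user.name",
        "def get_user_email(user): return user.email",
        "Return the user name.",
        "Return the user name.",
    )
    print(features)
```

In this example the code changes while the comment stays the same, so `sim_drop` is positive. A learning-based detector could take vectors like this as input to a classifier, but which features and which model work well is exactly what the paper evaluates.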
