
DocChecker: Bootstrapping Code Large Language Model for Detecting and Resolving Code-Comment Inconsistencies

Published 10 Jun 2023 in cs.SE | arXiv:2306.06347v3

Abstract: Comments within source code are essential for developers to comprehend the code's purpose and ensure its correct usage. However, as codebases evolve, maintaining an accurate alignment between comments and code becomes increasingly challenging. Although interest in automated solutions for detecting and correcting differences between code and its accompanying comments is growing, current methods rely primarily on heuristic rules. In contrast, this paper presents DocChecker, a tool powered by deep learning. DocChecker identifies inconsistencies between code and comments and can also generate synthetic comments, which enables it to detect and correct instances where comments do not accurately reflect their corresponding code segments. We demonstrate the effectiveness of DocChecker on the Just-In-Time and CodeXGLUE datasets in different settings. In particular, DocChecker achieves a new state-of-the-art result of 72.3% accuracy on the Inconsistency Code-Comment Detection (ICCD) task and 33.64 BLEU-4 on the code summarization task against other LLMs, even surpassing GPT-3.5 and CodeLlama. DocChecker is available for use and evaluation on GitHub (https://github.com/FSoft-AI4Code/DocChecker) and as an online tool (http://4.193.50.237:5000/). For a more comprehensive understanding of its functionality, a demonstration video is available on YouTube: https://youtu.be/FqnPmd531xw.
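The ICCD task described in the abstract operates on (comment, code) pairs and labels each pair as consistent or inconsistent. As a purely illustrative sketch (not drawn from the paper or its codebase), the kind of stale pair such a detector must flag looks like this: the docstring describes behavior the code no longer has.

```python
def average(values):
    """Return the sum of the given values."""
    # The docstring above is stale: the implementation was later changed
    # to return the mean, so comment and code are now inconsistent.
    return sum(values) / len(values)


# An inconsistency detector consumes pairs like this one; a model such as
# DocChecker would be expected to label it "inconsistent" and regenerate
# a comment that matches the current implementation.
stale_pair = (average.__doc__, "return sum(values) / len(values)")
```

Here `average` and `stale_pair` are hypothetical names chosen for illustration; DocChecker's actual input format is documented in its GitHub repository.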

References (32)
  1. Unified pre-training for program understanding and generation. arXiv preprint arXiv:2103.06333.
  2. Coherence of comments and method implementations: a dataset and an empirical investigation. Software Quality Journal, 26:751–777.
  3. BERT: pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers), pages 4171–4186. Association for Computational Linguistics.
  4. Luca Di Grazia and Michael Pradel. 2023. Code search: A survey of techniques for finding code. ACM Comput. Surv., 55(11).
  5. CodeBERT: A pre-trained model for programming and natural languages.
  6. UniXcoder: Unified cross-modal pre-training for code representation. pages 7212–7225.
  7. CodeSearchNet challenge: Evaluating the state of semantic code search. CoRR, abs/1909.09436.
  8. Learning and evaluating contextual embedding of source code. In Proceedings of the 37th International Conference on Machine Learning, ICML’20. JMLR.org.
  9. Align before fuse: Vision and language representation learning with momentum distillation. Advances in neural information processing systems, 34:9694–9705.
  10. StarCoder: may the source be with you! arXiv preprint arXiv:2305.06161.
  11. Automated comment update: How far are we? In 2021 IEEE/ACM 29th International Conference on Program Comprehension (ICPC), pages 36–46.
  12. Chin-Yew Lin and Franz Josef Och. 2004. Orange: a method for evaluating automatic evaluation metrics for machine translation. In COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics, pages 501–507.
  13. RoBERTa: A robustly optimized BERT pretraining approach. CoRR, abs/1907.11692.
  14. Just-in-time obsolete comment detection and update. IEEE Transactions on Software Engineering, pages 1–1.
  15. CodeXGLUE: A machine learning benchmark dataset for code understanding and generation. In Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1).
  16. The vault: A comprehensive multilingual dataset for advancing code understanding and generation.
  17. CodeGen: An open large language model for code with multi-turn program synthesis. arXiv preprint arXiv:2203.13474.
  18. Deep just-in-time inconsistency detection between comments and source code. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 427–435.
  19. Fazle Rabbi and Md Saeed Siddik. 2020. Detecting code comment inconsistency using siamese recurrent network. In Proceedings of the 28th International Conference on Program Comprehension, ICPC ’20, page 371–375. Association for Computing Machinery.
  20. Improving language understanding by generative pre-training.
  21. Inderjot Kaur Ratol and Martin P. Robillard. 2017. Detecting fragile comments. In 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE), pages 112–122. IEEE.
  22. Code llama: Open foundation models for code. arXiv preprint arXiv:2308.12950.
  23. Are we building on the rock? on the importance of data preprocessing for code summarization. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pages 107–119.
  24. Theo Steiner and Rui Zhang. 2022. Code comment inconsistency detection with BERT and Longformer. arXiv preprint arXiv:2207.14444.
  25. On the importance of building high-quality training datasets for neural code search. In Proceedings of the 44th International Conference on Software Engineering, pages 1609–1620.
  26. Intellicode compose: Code generation using transformer. ESEC/FSE 2020, page 1433–1443.
  27. @tComment: Testing Javadoc comments to detect comment-code inconsistencies. In 2012 IEEE Fifth International Conference on Software Testing, Verification and Validation, pages 260–269.
  28. Attention is all you need. Advances in neural information processing systems, 30.
  29. CodeT5+: Open code large language models for code understanding and generation. arXiv preprint arXiv:2305.07922.
  30. CodeT5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation.
  31. A large-scale empirical study on code-comment inconsistencies. In 2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC), pages 53–64. IEEE.
  32. A systematic evaluation of large language models of code. In Proceedings of the 6th ACM SIGPLAN International Symposium on Machine Programming, MAPS 2022, page 1–10, New York, NY, USA. Association for Computing Machinery.