Large Language Models for Code Summarization (2405.19032v1)
Published 29 May 2024 in cs.AI, cs.LG, cs.PL, and cs.SE
Abstract: Recently, there has been increasing activity in using deep learning for software engineering, including tasks like code generation and summarization. In particular, the most recent coding LLMs seem to perform well on these problems. In this technical report, we aim to review how these models perform in code explanation/summarization, while also investigating their code generation capabilities (based on natural language descriptions).