Less is More: DocString Compression in Code Generation (2410.22793v3)

Published 30 Oct 2024 in cs.SE

Abstract: The widespread use of LLMs in software engineering has intensified the need for improved model and resource efficiency. In particular, for neural code generation, LLMs are used to translate a function/method signature and DocString into executable code. DocStrings, which capture the user requirements for the code and serve as the prompt for LLMs, often contain redundant information. Recent advancements in prompt compression have shown promising results in NLP, but their applicability to code generation remains uncertain. Our empirical study shows that state-of-the-art prompt compression methods achieve only about 10% reduction, as further reductions would cause significant performance degradation. We propose a novel compression method, ShortenDoc, dedicated to DocString compression for code generation. Our extensive experiments on six code generation datasets, five open-source LLMs (1B to 10B parameters), and one closed-source LLM (GPT-4o) confirm that ShortenDoc achieves 25-40% compression while preserving the quality of generated code, outperforming other baseline methods at similar compression levels. This research improves efficiency and reduces cost while maintaining the quality of the generated code, especially when calling third-party APIs, cutting token processing costs by 25-40%.
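The abstract describes the overall pipeline: compress the DocString portion of a code-generation prompt while preserving the requirement-bearing content, keep the signature intact, and send the shorter prompt to a code LLM. The sketch below is a minimal illustration of that pipeline only; it is not the ShortenDoc algorithm, whose details are not given in the abstract. The greedy word-removal loop, the bag-of-words similarity score, and the `compress_docstring`/`build_prompt` helpers and threshold values are all illustrative assumptions.

```python
# Minimal sketch of DocString compression in a code-generation prompt.
# NOT the ShortenDoc algorithm from the paper: the greedy word-removal
# heuristic and the bag-of-words similarity score are stand-in assumptions
# used only to show the interface (compress the DocString, keep the
# signature, assemble the shorter prompt for a code LLM).
import math
import re
from collections import Counter


def _similarity(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors (a crude proxy
    for a semantic-preservation score)."""
    ca = Counter(re.findall(r"\w+", a.lower()))
    cb = Counter(re.findall(r"\w+", b.lower()))
    dot = sum(ca[w] * cb[w] for w in set(ca) & set(cb))
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


def compress_docstring(docstring: str, target_ratio: float = 0.7,
                       min_similarity: float = 0.85) -> str:
    """Greedily drop the word whose removal hurts similarity to the
    original DocString the least, until the target length is reached
    or similarity would fall below the threshold."""
    words = docstring.split()
    target_len = max(1, int(len(words) * target_ratio))
    while len(words) > target_len:
        best_idx, best_sim = None, -1.0
        for i in range(len(words)):
            candidate = " ".join(words[:i] + words[i + 1:])
            sim = _similarity(docstring, candidate)
            if sim > best_sim:
                best_idx, best_sim = i, sim
        if best_sim < min_similarity:
            break  # stop before degrading the requirement description
        words.pop(best_idx)
    return " ".join(words)


def build_prompt(signature: str, docstring: str) -> str:
    """Assemble the signature and compressed DocString into a prompt
    for a code-generation model."""
    return f'{signature}\n    """{compress_docstring(docstring)}"""\n'


if __name__ == "__main__":
    sig = "def moving_average(values, window):"
    doc = ("Given a list of numeric values and an integer window size, "
           "compute and return a new list containing the moving average "
           "of the input values over each sliding window of that size.")
    print(build_prompt(sig, doc))
```

A real compression method such as ShortenDoc would rely on a learned or model-based importance signal rather than word overlap; the point here is only the interface and the trade-off it manages between prompt length and preserved requirements.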

