
Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback (2403.16792v3)

Published 25 Mar 2024 in cs.CL and cs.SE

Abstract: LLMs have shown remarkable progress in automated code generation. Yet, LLM-generated code may contain errors in API usage, classes, or data structures, or may miss project-specific information. Since much of this project-specific context cannot fit into the prompts of LLMs, we must find ways to allow the model to explore the project-level code context. We present CoCoGen, a new code generation approach that uses compiler feedback to improve the LLM-generated code. CoCoGen first leverages static analysis to identify mismatches between the generated code and the project's context. It then iteratively aligns and fixes the identified errors using information extracted from the code repository. We integrate CoCoGen with two representative LLMs, i.e., GPT-3.5-Turbo and Code Llama (13B), and apply it to Python code generation. Experimental results show that CoCoGen significantly improves the vanilla LLMs by over 80% in generating code dependent on the project context and consistently outperforms the existing retrieval-based code generation baselines.
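The abstract describes an iterative loop: draft code with an LLM, use static analysis against the repository to detect mismatches with the project context, then re-prompt the model with the diagnostics and the retrieved project information. The sketch below illustrates that loop under stated assumptions: llm_generate is a hypothetical stand-in for a call to GPT-3.5-Turbo or Code Llama, the "analysis" step is reduced to Python's built-in compile() syntax check, and the retrieval step is a naive file scan rather than CoCoGen's actual static analyzer or retriever.

```python
# Minimal sketch of an iterative compiler-feedback refinement loop.
# All function names and prompt wording here are illustrative assumptions,
# not CoCoGen's real interface.

from pathlib import Path

MAX_ROUNDS = 5  # stop after a fixed number of repair attempts


def llm_generate(prompt: str) -> str:
    """Placeholder for a call to an LLM (e.g. GPT-3.5-Turbo or Code Llama)."""
    raise NotImplementedError("plug in a model call here")


def analyze(code: str) -> list[str]:
    """Stand-in static analysis: report syntax errors via Python's compile()."""
    try:
        compile(code, "<generated>", "exec")
        return []
    except SyntaxError as err:
        return [f"line {err.lineno}: {err.msg}"]


def retrieve_context(repo: Path, limit: int = 3) -> str:
    """Crude stand-in for repository retrieval: grab snippets of project files."""
    snippets = [p.read_text()[:500] for p in sorted(repo.rglob("*.py"))[:limit]]
    return "\n\n".join(snippets)


def refine(task: str, repo: Path) -> str:
    """Generate code, then repeatedly repair it using analysis feedback."""
    code = llm_generate(task)
    for _ in range(MAX_ROUNDS):
        diagnostics = analyze(code)
        if not diagnostics:
            break  # code passes the checks; stop iterating
        context = retrieve_context(repo)
        prompt = (
            f"{task}\n\nPrevious attempt:\n{code}\n\n"
            "Errors:\n" + "\n".join(diagnostics) +
            f"\n\nRelevant project code:\n{context}\n\nFix the errors."
        )
        code = llm_generate(prompt)  # ask the model to repair its own output
    return code
```

In the paper, the analysis step goes beyond syntax checking to flag project-level mismatches such as misused APIs and undefined classes, and the retrieved context is targeted at those specific errors rather than arbitrary files.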

Authors (10)
  1. Zhangqian Bi (7 papers)
  2. Yao Wan (70 papers)
  3. Zheng Wang (400 papers)
  4. Hongyu Zhang (147 papers)
  5. Batu Guan (3 papers)
  6. Fangxin Lu (1 paper)
  7. Zili Zhang (25 papers)
  8. Yulei Sui (29 papers)
  9. Xuanhua Shi (20 papers)
  10. Hai Jin (83 papers)
Citations (3)