
INTERVENOR: Prompting the Coding Ability of Large Language Models with the Interactive Chain of Repair (2311.09868v5)

Published 16 Nov 2023 in cs.SE and cs.AI

Abstract: This paper introduces INTERVENOR (INTERactiVE chaiN Of Repair), a system designed to emulate the interactive code repair process observed in humans, encompassing both code diagnosis and code repair. INTERVENOR prompts LLMs to play distinct roles during the repair process, functioning as both a Code Learner and a Code Teacher. Specifically, the Code Learner follows instructions to generate or repair code, while the Code Teacher crafts a Chain-of-Repair (CoR) to guide the Code Learner. While generating the CoR, the Code Teacher inspects the code produced by the Code Learner and reassesses how to address its bugs based on error feedback received from compilers. Experimental results demonstrate that INTERVENOR surpasses baseline models, improving on GPT-3.5 by approximately 18% and 4.3% in code generation and code translation tasks, respectively. Further analyses show that the CoR effectively illuminates the reasons behind bugs and outlines solution plans in natural language. With feedback from code compilers, INTERVENOR can accurately identify syntax errors and assertion errors and provide precise instructions for repairing code. All data and code are available at https://github.com/NEUIR/INTERVENOR
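The interactive Learner/Teacher loop described in the abstract can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's implementation: `code_learner` and `code_teacher` stand in for LLM calls and are replaced here by toy stubs so the control flow (generate, execute with compiler/test feedback, produce a Chain-of-Repair, regenerate) is runnable end-to-end.

```python
# Hypothetical sketch of INTERVENOR's interactive Chain-of-Repair (CoR) loop.
# `code_learner` and `code_teacher` are toy stand-ins for the LLM roles.

import traceback
from typing import Optional

def run_with_tests(code: str, test: str) -> Optional[str]:
    """Execute candidate code plus its tests; return error feedback or None."""
    env: dict = {}
    try:
        exec(code, env)   # compile and run the candidate
        exec(test, env)   # assertion-style tests, as in execution-based benchmarks
        return None
    except Exception:
        return traceback.format_exc(limit=1)

def code_learner(instruction: str, guidance: Optional[str] = None) -> str:
    """Stub Code Learner: first attempt is buggy; with CoR guidance it repairs."""
    if guidance is None:
        return "def add(a, b):\n    return a - b\n"  # deliberate bug
    return "def add(a, b):\n    return a + b\n"

def code_teacher(code: str, feedback: str) -> str:
    """Stub Code Teacher: turn execution feedback into a natural-language CoR."""
    return (f"The tests failed with:\n{feedback}\n"
            "Bug reason: wrong operator. Repair plan: replace '-' with '+'.")

def intervenor(instruction: str, test: str, max_rounds: int = 3) -> str:
    guidance = None
    for _ in range(max_rounds):
        code = code_learner(instruction, guidance)
        feedback = run_with_tests(code, test)
        if feedback is None:
            return code                          # all tests pass
        guidance = code_teacher(code, feedback)  # CoR steers the next round
    return code

fixed = intervenor("Write add(a, b).", "assert add(2, 3) == 5")
```

In the paper's setting, both roles are played by the same LLM under different prompts, and the feedback comes from real compiler and test-harness output rather than a toy traceback.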

