Self-collaboration Code Generation via ChatGPT (2304.07590v3)

Published 15 Apr 2023 in cs.SE

Abstract: Although LLMs have demonstrated remarkable code-generation ability, they still struggle with complex tasks. In real-world software development, humans usually tackle complex tasks through collaborative teamwork, a strategy that significantly controls development complexity and enhances software quality. Inspired by this, we present a self-collaboration framework for code generation employing LLMs, exemplified by ChatGPT. Specifically, through role instructions, 1) Multiple LLM agents act as distinct `experts', each responsible for a specific subtask within a complex task; 2) Specify the way to collaborate and interact, so that different roles form a virtual team to facilitate each other's work, ultimately the virtual team addresses code generation tasks collaboratively without the need for human intervention. To effectively organize and manage this virtual team, we incorporate software-development methodology into the framework. Thus, we assemble an elementary team consisting of three LLM roles (i.e., analyst, coder, and tester) responsible for software development's analysis, coding, and testing stages. We conduct comprehensive experiments on various code-generation benchmarks. Experimental results indicate that self-collaboration code generation relatively improves 29.9%-47.1% Pass@1 compared to the base LLM agent. Moreover, we showcase that self-collaboration could potentially enable LLMs to efficiently handle complex repository-level tasks that are not readily solved by the single LLM agent.

PDF Abstract

Self-Collaboration Code Generation via ChatGPT

The paper "Self-collaboration Code Generation via ChatGPT" presents an innovative framework for enhancing code generation capabilities of LLMs, specifically using ChatGPT. The authors propose a self-collaboration approach inspired by human collaborative strategies in software development, where complex tasks are divided and managed through teamwork—ultimately improving software quality and overcoming development complexity.

Overview of the Self-Collaboration Framework

The self-collaboration framework entails two main parts: division of labor and collaboration. Division of labor is achieved by assigning multiple LLM agents distinct roles, with each role responsible for a specific subtask within the broader code generation task. The roles are divided into analyst, coder, and tester, representing the stages of analysis, coding, and testing in software development. Role instructions guide these agents to ensure they think and perform tasks from the perspective of their assigned roles, acting as domain 'experts.'

In the collaboration phase, interaction among roles occurs via natural language. These interactions facilitate mutual enhancement of each agent’s outputs, promoting a cohesive virtual team that addresses tasks cooperatively without human intervention. Coordination is managed using a shared blackboard, where roles exchange necessary information to refine and update the final code iteratively.

Experimental Results and Findings

The framework was evaluated on several benchmarks, including MBPP, HumanEval, and APPS. The self-collaboration approach based on ChatGPT (GPT-3.5) achieved significant improvements over the base model. Specifically, it enhanced the Pass@1 rate by 29.9\% to 47.1%. These improvements were more pronounced in datasets featuring extended test cases, indicating enhanced reliability in generating code that passes comprehensive testing.

Comparative analysis with other prompting approaches and even LLMs customized for code, such as CodeX and CodeGeeX, showcased the superior performance of the self-collaboration framework. This suggests that incorporating collaboration methodologies similar to human teamwork can effectively bolster the capabilities of LLMs, especially for complex tasks that single agents struggle to solve.

Implications and Future Directions

Practically, the self-collaboration framework could lead to more efficient software development processes by relying less on human oversight and improving the robustness of generated code. Theoretically, this introduces a novel perspective on using LLMs collectively, leveraging their individual strengths to form highly effective virtual teams.

For future work, exploring more diverse team structures or introducing human oversight in crucial stages could further control the quality and consistency of the outputs in real-world applications. Additionally, integrating this approach with other software development methodologies or extending it to multi-agent collaborations that span various domains beyond software engineering could realize broad applications in AI advancements.

In conclusion, the paper provides a compelling argument and substantial empirical evidence for the benefits of self-collaboration in automating software development tasks. It identifies a promising direction for future research in collective intelligence among AI systems, paving the way for systematic advancements in this field.

PDF Markdown Bookmark Chat (Pro)

Authors (4)

Yihong Dong (35 papers)
Xue Jiang (82 papers)
Zhi Jin (160 papers)
Ge Li (213 papers)

Citations (192)

View on Semantic Scholar

Related Papers

Find Related Papers

Tweets

https://twitter.com/ComputerPapers/status/1790400276685361653