MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution (2403.17927v2)
Abstract: In software development, resolving the emergent issues within GitHub repositories is a complex challenge that involves not only the incorporation of new code but also the maintenance of existing code. LLMs have shown promise in code generation but struggle to resolve GitHub issues, particularly at the repository level. To overcome this challenge, we empirically study why LLMs fail to resolve GitHub issues and analyze the major factors. Motivated by the empirical findings, we propose a novel LLM-based Multi-Agent framework for GitHub Issue reSolution, MAGIS, consisting of four agents customized for software evolution: Manager, Repository Custodian, Developer, and Quality Assurance Engineer. The framework leverages the collaboration of these agents during planning and coding to unlock the potential of LLMs for resolving GitHub issues. In experiments on the SWE-bench benchmark, we compare MAGIS with popular LLMs, including GPT-3.5, GPT-4, and Claude 2. MAGIS resolves 13.94% of the GitHub issues, significantly outperforming the baselines; in particular, it achieves an eight-fold increase in the resolved ratio over the direct application of GPT-4, an advanced LLM.
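To make the architecture in the abstract concrete, the sketch below wires the four named roles into a locate → plan → code → review loop. Only the role names (Manager, Repository Custodian, Developer, Quality Assurance Engineer) come from the paper; the string-to-string `llm` callable, the prompt wording, the unified-diff output format, and the fixed revision budget are illustrative assumptions for this sketch, not MAGIS's actual interfaces.

```python
from dataclasses import dataclass
from typing import Callable, List

# Stand-in for any chat-completion endpoint (GPT-3.5, GPT-4, Claude 2, ...).
# The plain string-to-string signature is an assumption of this sketch.
LLM = Callable[[str], str]

@dataclass
class Issue:
    title: str
    body: str

class RepositoryCustodian:
    """Narrows the repository down to the files relevant to the issue."""
    def __init__(self, llm: LLM, repo_files: List[str]):
        self.llm, self.repo_files = llm, repo_files

    def locate(self, issue: Issue) -> List[str]:
        prompt = (f"Issue: {issue.title}\n{issue.body}\n\nFiles:\n"
                  + "\n".join(self.repo_files)
                  + "\n\nReply with the files that likely need changes, one per line.")
        reply = self.llm(prompt)
        return [f.strip() for f in reply.splitlines() if f.strip() in self.repo_files]

class Manager:
    """Decomposes the issue into one concrete coding task per relevant file."""
    def __init__(self, llm: LLM):
        self.llm = llm

    def plan(self, issue: Issue, files: List[str]) -> List[str]:
        prompt = (f"Issue: {issue.title}\n{issue.body}\n"
                  f"Relevant files: {', '.join(files)}\n"
                  "Write one coding task per file, one per line.")
        return [t.strip() for t in self.llm(prompt).splitlines() if t.strip()]

class Developer:
    """Implements a single task as a patch."""
    def __init__(self, llm: LLM):
        self.llm = llm

    def implement(self, task: str) -> str:
        return self.llm(f"Task: {task}\nReturn a unified diff implementing it.")

class QAEngineer:
    """Reviews a patch against its task and votes APPROVE or REJECT."""
    def __init__(self, llm: LLM):
        self.llm = llm

    def approves(self, task: str, patch: str) -> bool:
        verdict = self.llm(f"Task: {task}\nPatch:\n{patch}\nAnswer APPROVE or REJECT.")
        return "APPROVE" in verdict.upper()

def resolve(issue: Issue, llm: LLM, repo_files: List[str],
            max_rounds: int = 3) -> List[str]:
    """Locate relevant files, plan tasks, then code-and-review each task."""
    custodian = RepositoryCustodian(llm, repo_files)
    manager, developer, qa = Manager(llm), Developer(llm), QAEngineer(llm)

    patches: List[str] = []
    for task in manager.plan(issue, custodian.locate(issue)):
        for _ in range(max_rounds):  # revise until QA approves or budget is spent
            patch = developer.implement(task)
            if qa.approves(task, patch):
                patches.append(patch)
                break
    return patches
```

A caller would supply any model wrapper matching the assumed signature, e.g. `resolve(Issue("Fix crash on empty input", "..."), my_llm, ["app.py", "utils.py"])`. The real framework's prompts and agent coordination are more elaborate than this single-pass loop; the sketch only fixes the division of labor the abstract describes.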
- Anthropic. Claude 2. https://www.anthropic.com/news/claude-2, 2023.
- Program synthesis with large language models. arXiv Preprint, abs/2108.07732, 2021. URL https://arxiv.org/abs/2108.07732.
- Factors influencing code review processes in industry. In Thomas Zimmermann, Jane Cleland-Huang, and Zhendong Su, editors, Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE 2016, Seattle, WA, USA, November 13-18, 2016, pages 85–96. ACM, 2016. doi: 10.1145/2950290.2950323. URL https://doi.org/10.1145/2950290.2950323.
- Got issues? Who cares about it? A large scale investigation of issue trackers from GitHub. In IEEE 24th International Symposium on Software Reliability Engineering, ISSRE 2013, Pasadena, CA, USA, November 4-7, 2013, pages 188–197. IEEE Computer Society, 2013. doi: 10.1109/ISSRE.2013.6698918. URL https://doi.org/10.1109/ISSRE.2013.6698918.
- Impact of developer reputation on code review outcomes in OSS projects: an empirical investigation. In Maurizio Morisio, Tore Dybå, and Marco Torchiano, editors, 2014 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM ’14, Torino, Italy, September 18-19, 2014, pages 33:1–33:10. ACM, 2014. doi: 10.1145/2652524.2652544. URL https://doi.org/10.1145/2652524.2652544.
- Sparks of artificial general intelligence: Early experiments with GPT-4. arXiv Preprint, abs/2303.12712, 2023. doi: 10.48550/ARXIV.2303.12712. URL https://doi.org/10.48550/arXiv.2303.12712.
- ChatEval: Towards better LLM-based evaluators through multi-agent debate. arXiv Preprint, abs/2308.07201, 2023. doi: 10.48550/ARXIV.2308.07201. URL https://doi.org/10.48550/arXiv.2308.07201.
- PTP: Boosting stability and performance of prompt tuning with perturbation-based regularizer. In Houda Bouamor, Juan Pino, and Kalika Bali, editors, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023, Singapore, December 6-10, 2023, pages 13512–13525. Association for Computational Linguistics, 2023. URL https://aclanthology.org/2023.emnlp-main.833.
- Evaluating large language models trained on code. arXiv Preprint, abs/2107.03374, 2021. URL https://arxiv.org/abs/2107.03374.
- ClassEval: A manually-crafted benchmark for evaluating LLMs on class-level code generation, 2023.
- OpenLLaMA: An open reproduction of LLaMA, May 2023. URL https://github.com/openlm-research/open_llama.
- MetaGPT: Meta programming for a multi-agent collaborative framework, 2023.
- Large language models for software engineering: A systematic literature review. arXiv Preprint, abs/2308.10620, 2023. doi: 10.48550/ARXIV.2308.10620. URL https://doi.org/10.48550/arXiv.2308.10620.
- Practitioners’ expectations on automated code comment generation. In 44th IEEE/ACM International Conference on Software Engineering, ICSE 2022, Pittsburgh, PA, USA, May 25-27, 2022, pages 1693–1705. ACM, 2022. doi: 10.1145/3510003.3510152. URL https://doi.org/10.1145/3510003.3510152.
- SWE-bench: Can language models resolve real-world GitHub issues? In The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024. OpenReview.net, 2024. URL https://openreview.net/forum?id=VTF8yNQM66.
- Thomas Johnsson. Attribute grammars as a functional programming paradigm. In Gilles Kahn, editor, Functional Programming Languages and Computer Architecture, Portland, Oregon, USA, September 14-16, 1987, Proceedings, volume 274 of Lecture Notes in Computer Science, pages 154–173. Springer, 1987. doi: 10.1007/3-540-18317-5_10. URL https://doi.org/10.1007/3-540-18317-5_10.
- Investigating code review quality: Do people and participation matter? In Rainer Koschke, Jens Krinke, and Martin P. Robillard, editors, 2015 IEEE International Conference on Software Maintenance and Evolution, ICSME 2015, Bremen, Germany, September 29 - October 1, 2015, pages 111–120. IEEE Computer Society, 2015. doi: 10.1109/ICSM.2015.7332457. URL https://doi.org/10.1109/ICSM.2015.7332457.
- Is your code generated by ChatGPT really correct? Rigorous evaluation of large language models for code generation. In Alice Oh, Tristan Naumann, Amir Globerson, Kate Saenko, Moritz Hardt, and Sergey Levine, editors, Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023, 2023a. URL http://papers.nips.cc/paper_files/paper/2023/hash/43e9d647ccd3e4b7b5baab53f0368686-Abstract-Conference.html.
- Lost in the middle: How language models use long contexts. arXiv Preprint, abs/2307.03172, 2023b. doi: 10.48550/ARXIV.2307.03172. URL https://doi.org/10.48550/arXiv.2307.03172.
- The impact of code review coverage and code review participation on software quality: a case study of the qt, vtk, and ITK projects. In Premkumar T. Devanbu, Sung Kim, and Martin Pinzger, editors, 11th Working Conference on Mining Software Repositories, MSR 2014, Proceedings, May 31 - June 1, 2014, Hyderabad, India, pages 192–201. ACM, 2014. doi: 10.1145/2597073.2597076. URL https://doi.org/10.1145/2597073.2597076.
- Developer-intent driven code comment generation. In 45th IEEE/ACM International Conference on Software Engineering, ICSE 2023, Melbourne, Australia, May 14-20, 2023, pages 768–780. IEEE, 2023. doi: 10.1109/ICSE48619.2023.00073. URL https://doi.org/10.1109/ICSE48619.2023.00073.
- OpenAI. GPT-4 technical report, 2023. URL https://doi.org/10.48550/arXiv.2303.08774.
- OpenAI. GPT-3.5 Turbo fine-tuning and API updates. https://openai.com/blog/gpt-3-5-turbo-fine-tuning-and-api-updates, 2023a.
- OpenAI. GPT-4. https://openai.com/research/gpt-4, 2023b.
- Communicative agents for software development. arXiv Preprint, 2023.
- Improving language understanding by generative pre-training, 2018.
- Okapi at TREC-3. In Donna K. Harman, editor, Proceedings of The Third Text REtrieval Conference, TREC 1994, Gaithersburg, Maryland, USA, November 2-4, 1994, volume 500-225 of NIST Special Publication, pages 109–126. National Institute of Standards and Technology (NIST), 1994. URL http://trec.nist.gov/pubs/trec3/papers/city.ps.gz.
- Jessica Shieh. Best practices for prompt engineering with OpenAI API. OpenAI, February 2023. URL https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-openai-api.
- A survey of neural code intelligence: Paradigms, advances and beyond, 2024.
- Multi-agent collaboration: Harnessing the power of intelligent LLM agents. arXiv Preprint, abs/2306.03314, 2023. doi: 10.48550/ARXIV.2306.03314. URL https://doi.org/10.48550/arXiv.2306.03314.
- KADEL: Knowledge-aware denoising learning for commit message generation. ACM Trans. Softw. Eng. Methodol., January 2024. ISSN 1049-331X. doi: 10.1145/3643675. URL https://doi.org/10.1145/3643675.
- LLaMA-MoE Team. LLaMA-MoE: Building mixture-of-experts from LLaMA with continual pre-training, December 2023. URL https://github.com/pjlab-sys4nlp/llama-moe.
- The Cognition Team. SWE-bench technical report, 2024. URL https://www.cognition-labs.com/post/swe-bench-technical-report.
- LLaMA: Open and efficient foundation language models. arXiv Preprint, abs/2302.13971, 2023.
- AutoDev: Automated AI-driven development, 2024.
- AutoGen: Enabling next-gen LLM applications via multi-agent conversation framework. arXiv Preprint, abs/2308.08155, 2023. doi: 10.48550/ARXIV.2308.08155. URL https://doi.org/10.48550/arXiv.2308.08155.
- A survey of large language models. arXiv Preprint, abs/2303.18223, 2023. doi: 10.48550/ARXIV.2303.18223. URL https://doi.org/10.48550/arXiv.2303.18223.
- Judging LLM-as-a-judge with MT-Bench and Chatbot Arena. In Alice Oh, Tristan Naumann, Amir Globerson, Kate Saenko, Moritz Hardt, and Sergey Levine, editors, Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023, 2023a. URL http://papers.nips.cc/paper_files/paper/2023/hash/91f18a1287b398d378ef22505bf41832-Abstract-Datasets_and_Benchmarks.html.
- Towards an understanding of large language models in software engineering tasks. arXiv Preprint, abs/2308.11396, 2023b. doi: 10.48550/ARXIV.2308.11396. URL https://doi.org/10.48550/arXiv.2308.11396.
- Fault analysis and debugging of microservice systems: Industrial survey, benchmark system, and empirical study. IEEE Trans. Software Eng., 47(2):243–260, 2021. doi: 10.1109/TSE.2018.2887384. URL https://doi.org/10.1109/TSE.2018.2887384.
- Thread of thought unraveling chaotic contexts. arXiv Preprint, abs/2311.08734, 2023. doi: 10.48550/ARXIV.2311.08734. URL https://doi.org/10.48550/arXiv.2311.08734.