
Large Language Model-Based Agents for Software Engineering: A Survey (2409.02977v1)

Published 4 Sep 2024 in cs.SE and cs.AI

Abstract: The recent advance in LLMs has shaped a new paradigm of AI agents, i.e., LLM-based agents. Compared to standalone LLMs, LLM-based agents substantially extend the versatility and expertise of LLMs by enhancing LLMs with the capabilities of perceiving and utilizing external resources and tools. To date, LLM-based agents have been applied and shown remarkable effectiveness in Software Engineering (SE). The synergy between multiple agents and human interaction brings further promise in tackling complex real-world SE problems. In this work, we present a comprehensive and systematic survey on LLM-based agents for SE. We collect 106 papers and categorize them from two perspectives, i.e., the SE and agent perspectives. In addition, we discuss open challenges and future directions in this critical domain. The repository of this survey is at https://github.com/FudanSELab/Agent4SE-Paper-List.

LLM-Based Agents for Software Engineering: A Survey

Introduction

The rapid advancement of LLMs has given rise to a new AI paradigm: LLM-based agents. Unlike standalone LLMs, these agents significantly broaden LLM capabilities by enabling the perception and utilization of external resources and tools. This paper provides a thorough survey of LLM-based agents applied to Software Engineering (SE), categorizing 106 collected papers from both SE and agent perspectives. The survey also identifies current challenges and suggests future research directions.

SE Perspective

From the SE perspective, the paper analyzes how LLM-based agents are utilized across various phases of the software life cycle, including requirements engineering, code generation, static code checking, testing, debugging, and end-to-end software development and maintenance.

Requirements Engineering (RE)

LLM-based agents have demonstrated their utility in automating multiple phases of RE, such as elicitation, specification, and verification. For instance, Elicitron dynamically generates requirements by simulating user interactions, while SpecGen creates Java Modeling Language specifications validated through OpenJML. Multi-agent frameworks like MARE cover multiple RE stages, including requirement elicitation, modeling, and verification.
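The elicitation pattern described above (simulated users generating needs that are then aggregated) can be sketched as follows. This is a minimal illustration, not Elicitron's actual implementation: `simulate_user` is a hypothetical stand-in for an LLM-driven persona simulation, here returning canned responses.

```python
# Sketch of agent-based requirements elicitation in the style of Elicitron.
# `simulate_user` is a hypothetical stub; a real system would back it with an LLM.

def simulate_user(persona: str, product: str) -> list[str]:
    """Stand-in for an LLM-simulated user interview returning raw needs."""
    canned = {
        "commuter": [f"{product} must sync offline",
                     f"{product} should start in under 2s"],
        "admin": [f"{product} needs audit logs"],
    }
    return canned.get(persona, [])

def elicit_requirements(personas: list[str], product: str) -> list[str]:
    """Aggregate and deduplicate needs elicited from simulated user agents."""
    seen, requirements = set(), []
    for persona in personas:
        for need in simulate_user(persona, product):
            if need not in seen:
                seen.add(need)
                requirements.append(need)
    return requirements

reqs = elicit_requirements(["commuter", "admin"], "NoteApp")
```

The deduplication step mirrors the consolidation that elicitation frameworks perform before handing requirements to a specification or verification stage.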

Code Generation

LLM-based agents extend beyond standalone LLMs by incorporating planning and iterative refinement mechanisms to generate more accurate code. Strategies like Chain-of-Thought (CoT) planning decompose tasks into sub-tasks, enhancing effectiveness. Moreover, iterative feedback from tools, models, or humans refines the generated code. Agents like CodeCoT and CodePlan dynamically adapt their strategies based on hybrid feedback mechanisms combining model and tool feedback.

Static Code Checking

Static code checking benefits from multi-agent collaboration and the integration of static analysis tools. For instance, ART leverages tool libraries to enhance LLMs for static bug detection. IRIS and LLIFT combine traditional static analysis with LLM agents to pinpoint vulnerabilities and bugs. These agents dynamically navigate code repositories and validate static anomalies reported by tools.
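The validate-the-warnings pattern described above can be sketched as a two-stage pipeline. This is an illustrative toy, not IRIS's or LLIFT's actual logic: `static_analyzer` is a trivial heuristic flagging dereferences, and `llm_validate` is a hypothetical stand-in for an LLM judging each warning against surrounding code.

```python
# Sketch of LLM-validated static analysis: a cheap analyzer proposes warnings,
# and an LLM-stand-in filters out likely false positives.

def static_analyzer(source: str) -> list[int]:
    """Toy analyzer: flag every line number that dereferences `ptr`."""
    return [i for i, line in enumerate(source.splitlines())
            if "ptr." in line]

def llm_validate(source: str, line_no: int) -> bool:
    """Stand-in for an LLM's judgment: treat the warning as a true positive
    only when no preceding line guards `ptr` with a null check."""
    preceding = source.splitlines()[:line_no]
    return not any("if ptr" in line for line in preceding)

def triage(source: str) -> list[int]:
    """Keep only the warnings the validator confirms."""
    return [n for n in static_analyzer(source) if llm_validate(source, n)]
```

The division of labor is the point: the static tool guarantees recall over the repository, while the language model supplies the contextual reasoning that suppresses spurious reports.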

Testing

In software testing, agents generate unit and system-level tests iteratively, refining them to minimize errors and maximize coverage. For example, TestPilot refines tests by analyzing error messages iteratively, while CoverUp focuses on generating high-coverage tests. System-level testing agents like KernelGPT and WhiteFox incorporate code parsers and dynamic execution tools to validate tests on OS kernels and compilers, respectively.

Debugging

Existing LLM-based agents like RepairAgent and AutoSD employ iterative refinement for program repair, incorporating compilation and execution feedback. Fault localization, in turn, benefits from integration with spectrum-based tooling, as in AUTOFL. Unified debugging approaches combine fault localization and program repair; for example, FixAgent uses inter-agent collaboration to enhance debugging capabilities.

End-to-end Development and Maintenance

Agents facilitate complete software development and maintenance processes, leveraging established process models such as the waterfall model. Systems such as MetaGPT and AgileCoder simulate real-world development teams, incorporating multiple specialized roles like coders, testers, and managers. These agents dynamically collaborate, allocate tasks, and refine outputs iteratively.

Agent Perspective

From the agent perspective, the paper categorizes existing LLM-based agents into four key components: planning, memory, perception, and action.

Planning

Planning involves structuring and scheduling task execution. Some agents adopt single-path planning, generating a linear task sequence, while others implement multi-path strategies like MapCoder to explore various solutions. The representation of plans ranges from natural language to semi-structured formats.
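The single-path vs. multi-path distinction can be sketched as follows. This is an illustrative toy under stated assumptions: `candidate_plans` and `score` are hypothetical stand-ins for an LLM planner and a plan evaluator, not MapCoder's actual mechanism.

```python
# Sketch contrasting single-path planning (commit to one linear sequence)
# with multi-path planning (explore candidates and pick the best-scoring).

def candidate_plans(task: str) -> list[list[str]]:
    """Stand-in for an LLM proposing alternative plans."""
    return [
        ["parse input", "brute force", "return"],
        ["parse input", "sort", "binary search", "return"],
    ]

def score(plan: list[str]) -> int:
    """Stand-in scorer: prefer plans that include efficient steps."""
    return sum(step in ("sort", "binary search") for step in plan)

def single_path_plan(task: str) -> list[str]:
    return candidate_plans(task)[0]          # take the first plan as-is

def multi_path_plan(task: str) -> list[str]:
    return max(candidate_plans(task), key=score)
```

The step lists here stand for the survey's "natural language to semi-structured" plan representations; a real agent would execute each step with further LLM or tool calls.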

Memory

Effective memory management is crucial in SE tasks requiring iterative refinement. Memory types include short-term (e.g., action-observation sequences) and long-term (e.g., distilled task trajectories). Shared and specific memory mechanisms help agents retain context and historical information, vital for coherent task execution.
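The short-term/long-term split can be sketched with a bounded window plus a distilled store. This is a minimal illustration under stated assumptions: keyword matching stands in for the embedding-based retrieval a real agent would use.

```python
# Sketch of agent memory: a bounded short-term window of (action, observation)
# pairs, plus a long-term store of distilled trajectories.
from collections import deque

class AgentMemory:
    def __init__(self, window: int = 3):
        self.short_term = deque(maxlen=window)   # recent (action, obs) pairs
        self.long_term: list[str] = []           # distilled lessons

    def record(self, action: str, observation: str) -> None:
        self.short_term.append((action, observation))

    def distill(self) -> None:
        """Compress the current window into one long-term entry."""
        if self.short_term:
            self.long_term.append(" -> ".join(a for a, _ in self.short_term))

    def recall(self, keyword: str) -> list[str]:
        """Stand-in for retrieval: match long-term entries by keyword."""
        return [m for m in self.long_term if keyword in m]
```

Shared-memory variants would expose one such store to several agents, while agent-specific memory keeps a private instance per role.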

Perception

Agents primarily rely on textual input perception, aligning with the text-rich nature of SE activities. Some agents also incorporate visual input for GUI tasks, utilizing image recognition models.

Action

The action component leverages external tools to extend agent capabilities beyond text generation. Tools include search engines, static analysis tools, testing frameworks, and dynamic instrumentation tools, which facilitate comprehensive SE task automation.
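The tool-dispatch mechanism can be sketched as a registry the agent routes decisions through. This is a hedged illustration: the `"tool_name: argument"` decision format and both tool bodies are hypothetical stand-ins for an LLM's action output and real external tools.

```python
# Sketch of an action component: tools register into a table, and the agent
# dispatches a (stubbed) LLM decision of the form "tool_name: argument".
from typing import Callable

TOOLS: dict[str, Callable[[str], str]] = {}

def tool(name: str):
    """Decorator registering a function as an invocable tool."""
    def register(fn: Callable[[str], str]):
        TOOLS[name] = fn
        return fn
    return register

@tool("search")
def search(query: str) -> str:
    return f"results for '{query}'"          # stand-in for a search engine

@tool("run_tests")
def run_tests(target: str) -> str:
    return f"all tests passed in {target}"   # stand-in for a test framework

def act(decision: str) -> str:
    """Parse a 'tool_name: argument' decision and dispatch to the tool."""
    name, _, arg = decision.partition(":")
    handler = TOOLS.get(name.strip())
    if handler is None:
        return f"unknown tool '{name.strip()}'"
    return handler(arg.strip())
```

The unknown-tool branch matters in practice: agents frequently hallucinate tool names, so the action layer must return a corrective observation rather than crash.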

Future Directions

The survey highlights several open challenges and future research directions:

  1. Evaluation Metrics and Benchmarks: Developing comprehensive, fine-grained metrics and realistic benchmarks is critical for meaningful evaluations.
  2. Human-Agent Collaboration: Extending human participation across the software life cycle and designing effective interaction mechanisms are key areas for future exploration.
  3. Perception Modality: Broadening the range of perception modalities can improve agent flexibility and adaptability.
  4. Expanding SE Tasks: Developing agents tailored to underexplored SE tasks like design and verification can enhance their utility.
  5. Training Specialized LLMs: Incorporating diverse software lifecycle data into LLM training can create more robust models for SE agents.
  6. Integrating SE Expertise: Leveraging domain-specific SE techniques and methodologies can improve the efficiency and effectiveness of agent systems.

Conclusion

This survey provides a comprehensive analysis of the current landscape of LLM-based agents for SE. The paper explores the utilization of these agents across various SE activities and discusses the design of their core components. By addressing open challenges and outlining future research directions, this survey offers a roadmap for advancing the development and application of LLM-based agents in software engineering.

  205. Autocoderover: Autonomous program improvement. CoRR, abs/2404.05427, 2024.
  206. Swe-agent: Agent-computer interfaces enable automated software engineering. arXiv preprint arXiv:2405.15793, 2024.
  207. Coder: Issue resolving with multi-agent and task graphs. arXiv preprint arXiv:2406.01304, 2024.
  208. How to understand whole software repository? arXiv preprint arXiv:2406.01422, 2024.
  209. Masai: Modular architecture for software-engineering ai agents, 2024.
  210. Agentless: Demystifying llm-based software engineering agents. CoRR, abs/2407.01489, 2024.
  211. Some simple effective approximations to the 2-poisson model for probabilistic weighted retrieval. In SIGIR’94: Proceedings of the Seventeenth Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, organised by Dublin City University, pages 232–241. Springer, 1994.
  212. Swe-bench: Can language models resolve real-world github issues? CoRR, abs/2310.06770, 2023.
  213. SWE-bench Lite, 2024. https://www.swebench.com/lite.html.
  214. Introducing SWE-bench Verified, 2024. https://openai.com/index/introducing-swe-bench-verified/.
  215. Function Calling and other API updates, 2023. https://openai.com/index/function-calling-and-other-api-updates/.
  216. GPT-3.5, 2023. https://platform.openai.com/docs/models/gpt-3-5-turbo.
  217. GPT-4, 2023. https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo.
  218. React: Synergizing reasoning and acting in language models. In The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net, 2023.
  219. Low-code llm: Visual programming over llms. arXiv preprint arXiv:2304.08103, 2, 2023.
  220. Transformer feed-forward layers are key-value memories. In Marie-Francine Moens, Xuanjing Huang, Lucia Specia, and Scott Wen-tau Yih, editors, Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021, Virtual Event / Punta Cana, Dominican Republic, 7-11 November, 2021, pages 5484–5495. Association for Computational Linguistics, 2021.
  221. Iain D Craig. Blackboard systems. Artificial Intelligence Review, 2(2):103–118, 1988.
  222. Uml diagrams in software engineering research: a systematic literature review. In Proceedings, volume 74, page 13. MDPI, 2021.
  223. Seglink++: Detecting dense and arbitrary-shaped scene text by instance-aware component grouping. Pattern recognition, 96:106954, 2019.
  224. A convnet for the 2020s. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 11976–11986, 2022.
  225. Enchanting program specification synthesis by large language models using static analysis and program verification, 2024.
  226. Repoagent: An llm-powered open-source framework for repository-level code documentation generation, 2024.
  227. DuckDuckGo. https://duckduckgo.com/.
  228. SerpApi. https://serpapi.com/.
  229. Cocost: Automatic complex code generation with online searching and correctness testing, 2024.
  230. Seglink++: Detecting dense and arbitrary-shaped scene text by instance-aware component grouping. Pattern Recognition, 96:106954, 2019.
  231. Screen recognition: Creating accessibility metadata for mobile applications from pixels. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, pages 1–15, 2021.
  232. Antlr: A predicated-ll(k) parser generator. Software: Practice and Experience, 25(7):789–810, 1995.
  233. Jedi. https://github.com/davidhalter/jedi/.
  234. EclipseJDTLS. https://github.com/eclipse-jdtls/eclipse.jdt.ls.
  235. Black. https://github.com/psf/black.
  236. The nuxmv symbolic model checker. In Armin Biere and Roderick Bloem, editors, Computer Aided Verification, pages 334–342, Cham, 2014. Springer International Publishing.
  237. Slither: a static analysis framework for smart contracts. In 2019 IEEE/ACM 2nd International Workshop on Emerging Trends in Software Engineering for Blockchain (WETSEB), pages 8–15. IEEE, 2019.
  238. GPT-4, 2024. https://docs.oracle.com/javase/8/docs/api/java/lang/instrument/package-summary.html.
  239. Pynguin: automated unit test generation for python. In Proceedings of the ACM/IEEE 44th International Conference on Software Engineering: Companion Proceedings, ICSE ’22. ACM, May 2022.
  240. Konrad Hałas. Mutpy: a mutation testing tool for python 3.x source code, 2019. https://github.com/mutpy/mutpy.
  241. Gzoltar: an eclipse plug-in for testing and debugging. In Proceedings of the 27th IEEE/ACM international conference on automated software engineering, pages 378–381, 2012.
  242. Git. https://git-scm.com/.
  243. Engineering safety requirements for autonomous driving with large language models. CoRR, abs/2403.16289, 2024.
  244. OpenAI: Introducing ChatGPT, 2022. https://openai.com/blog/chatgpt/.
  245. Deepseek-coder: When the large language model meets programming - the rise of code intelligence. CoRR, abs/2401.14196, 2024.
  246. Starcoder 2 and the stack v2: The next generation. arXiv preprint arXiv:2402.19173, 2024.
Authors (7)
  1. Junwei Liu
  2. Kaixin Wang
  3. Yixuan Chen
  4. Xin Peng
  5. Zhenpeng Chen
  6. Lingming Zhang
  7. Yiling Lou
Citations (14)