- The paper presents a comprehensive survey of LLM-based agents in software engineering, organizing them into perception, memory, and action modules.
- It employs systematic categorization and analysis techniques to highlight challenges such as perception diversification and hallucination mitigation.
- The study proposes actionable future research directions to enhance multi-agent collaboration and integration with software engineering methodologies.
Agents in Software Engineering: Survey, Landscape, and Vision
Introduction
The integration of large language models (LLMs) into software engineering (SE) is an active research area with substantial contributions, open challenges, and room for further exploration. Recent work applies LLM-based agents to SE tasks such as code summarization, generation, and translation. This paper surveys the literature and systematically categorizes the functionality of LLM-based agents into three modules: perception, memory, and action. It also identifies key challenges and proposes a roadmap for future work.
LLM-based Agents Framework
Perception Module
The perception module is the agent's sensory interface: it lets an LLM-based agent interpret its inputs. In SE, input processing must handle both natural language and code, and the input can be represented in token-based, tree/graph-based, or hybrid form. Existing research focuses predominantly on token-based input; tree/graph-based input and other modalities, including visual and auditory input, remain underexplored.
Figure 1: An overview of the agent framework in SE.
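The difference between token-based, tree/graph-based, and hybrid input can be illustrated with a minimal sketch. It is not taken from the surveyed paper: it assumes Python source as the input, uses the standard-library `tokenize` and `ast` modules, and the function names `token_view`, `tree_view`, and `hybrid_view` are hypothetical.

```python
from __future__ import annotations

import ast
import io
import tokenize

SOURCE = "def add(a, b):\n    return a + b\n"

# Layout-only tokens that carry no lexical content of their own.
_LAYOUT = {
    tokenize.NEWLINE,
    tokenize.NL,
    tokenize.INDENT,
    tokenize.DEDENT,
    tokenize.ENDMARKER,
}


def token_view(source: str) -> list[str]:
    """Token-based view: the flat sequence of lexical tokens."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type not in _LAYOUT]


def tree_view(source: str) -> list[tuple[str, str]]:
    """Tree-based view: (parent, child) edges between AST node types."""
    edges = []
    for parent in ast.walk(ast.parse(source)):
        for child in ast.iter_child_nodes(parent):
            edges.append((type(parent).__name__, type(child).__name__))
    return edges


def hybrid_view(source: str) -> dict[str, list]:
    """Hybrid view (illustrative assumption): both representations together."""
    return {"tokens": token_view(source), "edges": tree_view(source)}


if __name__ == "__main__":
    view = hybrid_view(SOURCE)
    print(view["tokens"])
    print(view["edges"])
```

The token view discards structure that the tree view makes explicit, for example that the `+` operation is a child of the `return` statement, which is one reason tree/graph-based perception may deserve more attention.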
Memory Module
Memory modules in LLM-based agents comprise semantic, episodic, and procedural memory types, facilitating complex reasoning and decision-making processes:
- Semantic Memory: An external knowledge base that supplies the agent with world knowledge, often drawn from documents, libraries, and APIs. Expanding it through carefully curated knowledge bases remains important for agent performance.
- Episodic Memory: Extends the contextual awareness of agents through historical data and decision timelines, aiding in dynamic adaptation and contextual reasoning.
- Procedural Memory: Comprises explicit knowledge hard-coded into the agent and implicit knowledge embedded in the LLM's parameters. Refining the implicit knowledge by fine-tuning LLM weights remains computationally challenging.
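The three memory types can be pictured as one small data structure. The following is a sketch under stated assumptions, not the paper's design: the class `AgentMemory` and its methods are hypothetical, semantic lookup is a naive keyword match, and only explicit procedural knowledge is modeled, since implicit procedural knowledge lives in the LLM's weights.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentMemory:
    # Semantic memory: external knowledge keyed by topic (docs, library or API notes).
    semantic: dict[str, str] = field(default_factory=dict)
    # Episodic memory: chronological record of what happened during the task.
    episodic: list[str] = field(default_factory=list)
    # Procedural memory (explicit part only): hand-written routines the agent can call.
    procedural: dict[str, Callable[[str], str]] = field(default_factory=dict)

    def record(self, event: str) -> None:
        """Append an event to episodic memory."""
        self.episodic.append(event)

    def recall_facts(self, query: str) -> list[str]:
        """Return semantic entries whose key appears as a word in the query."""
        words = set(query.lower().split())
        return [fact for key, fact in self.semantic.items() if key.lower() in words]

    def build_context(self, query: str, last_n: int = 2) -> str:
        """Combine relevant facts and recent history into a prompt context."""
        lines = ["# facts", *self.recall_facts(query)]
        lines += ["# recent history", *self.episodic[-last_n:]]
        lines += ["# task", query]
        return "\n".join(lines)


if __name__ == "__main__":
    memory = AgentMemory(
        semantic={"json.loads": "json.loads parses a JSON string into Python objects."},
        procedural={"normalize": str.strip},
    )
    memory.record("step 1: generated parser")
    memory.record("step 2: unit test failed")
    print(memory.build_context("fix the json.loads call"))
```

A real agent would likely replace the keyword match with embedding-based retrieval and bound the episodic history by token budget rather than by a fixed count.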
Action Module
The action module covers two kinds of actions: internal and external. Internal actions such as reasoning and retrieval are improved by chain-of-thought (CoT) methods and advanced retrieval techniques. External actions are interactions with humans or digital environments, which return feedback that the agent can learn from.
Figure 2: Different CoTs from different methods, illustrating various strategies for improving reasoning via structured formats.
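The split between internal and external actions can be sketched as follows. This is an illustrative assumption rather than any surveyed system: retrieval is a simple word-overlap ranking, the CoT prompt uses one common zero-shot phrasing, the external environment is a small test harness, and all function names are hypothetical. No LLM is called.

```python
from __future__ import annotations

from typing import Callable


def retrieve(query: str, corpus: list[str], k: int = 1) -> list[str]:
    """Internal action: rank snippets by word overlap with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_cot_prompt(task: str, snippets: list[str]) -> str:
    """Internal action: wrap the task in a step-by-step reasoning prompt."""
    context = "\n".join(f"- {snippet}" for snippet in snippets)
    return f"Context:\n{context}\nTask: {task}\nLet's think step by step."


def run_tests(
    candidate: Callable[[int, int], int],
    cases: list[tuple[int, int, int]],
) -> str:
    """External action: run a candidate in the environment and report feedback."""
    failures = [(a, b, want) for a, b, want in cases if candidate(a, b) != want]
    if not failures:
        return "all tests passed"
    return f"{len(failures)} test(s) failed: {failures}"


if __name__ == "__main__":
    corpus = [
        "add returns the sum of two integers",
        "parse reads a config file",
    ]
    task = "write add for two integers"
    print(build_cot_prompt(task, retrieve(task, corpus)))
    # Feedback from the environment would be fed back into the next prompt.
    print(run_tests(lambda a, b: a - b, [(1, 2, 3), (0, 0, 0)]))
```

In a full agent loop, the feedback string returned by the external action would be stored and included in the next reasoning step.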
Challenges and Opportunities
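The paper highlights several challenges impeding the advancement of LLM-based agents in SE, among them diversifying perception beyond token-based input and mitigating hallucination. It also points to opportunities in multi-agent collaboration and in integrating agents with established software engineering methodologies.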
Conclusion
This survey outlines the framework of LLM-based agents in software engineering, describing the perception, memory, and action modules and the challenges each faces. By proposing this framework and suggesting directions for future research, the paper lays groundwork for the continued development of intelligent agents in SE.