Agents in Software Engineering: Survey, Landscape, and Vision
The paper "Agents in Software Engineering: Survey, Landscape, and Vision," by Yanxian Huang et al., surveys work that combines large language models (LLMs) with agent technologies to support a range of software engineering (SE) tasks. It proposes a conceptual framework for LLM-based agents in SE, identifies open challenges, and outlines future research opportunities.
Core Framework and Components
The proposed framework for LLM-based agents in SE is organized into three interconnected modules: perception, memory, and action.
- Perception Module: This module connects the LLM-based agent to the external environment. It takes inputs of different modalities (textual, visual, and auditory) and converts them into a form the LLM can process. The paper notes that current work favors token-based textual input and largely overlooks tree- and graph-based inputs, which could better capture the structural characteristics of code.
- Memory Module: This component consists of semantic, episodic, and procedural memory. Semantic memory is maintained using external knowledge retrieval bases containing documents, APIs, and other code-related knowledge. Episodic memory involves data from previous interactions and decision-making processes, utilized for in-context learning. Procedural memory encompasses long-term knowledge stored both implicitly in the LLM's weights and explicitly in agent code.
- Action Module: Actions of the LLM-based agent are classified as internal or external. Internal actions comprise reasoning, retrieval, and learning. Reasoning relies on methods such as Chain-of-Thought (CoT) and structured CoT to break a task into intermediate steps. Retrieval fetches relevant information from knowledge bases to support reasoning. Learning updates the agent's implicit and explicit knowledge over time. External actions are interactions with humans, other agents, and digital environments such as compilers and search engines, which supply iterative feedback and additional knowledge.
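The three modules above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the paper's implementation: the names `Memory`, `perceive`, `retrieve`, `check_syntax`, and `step` are hypothetical, no LLM is called, and Python's built-in `compile` stands in for the "digital environment" that returns feedback. The perception step produces both a token-like view and a tree-based view (AST node types) of the same code.

```python
import ast
from dataclasses import dataclass, field


@dataclass
class Memory:
    """Three stores named after the survey's taxonomy (hypothetical layout)."""
    semantic: dict[str, str] = field(default_factory=dict)    # external knowledge, e.g. API docs
    episodic: list[str] = field(default_factory=list)         # records of earlier steps
    procedural: dict[str, str] = field(default_factory=dict)  # explicit agent code or prompts


def perceive(source: str) -> dict:
    """Perception: build a token-like view and a tree-based view of the code."""
    try:
        node_types = [type(node).__name__ for node in ast.walk(ast.parse(source))]
    except SyntaxError:
        node_types = []  # no tree view is available for unparsable input
    return {"tokens": source.split(), "node_types": node_types}


def retrieve(memory: Memory, observation: dict) -> list[str]:
    """Internal action: return semantic-memory entries whose key occurs in a token."""
    return [doc for key, doc in memory.semantic.items()
            if any(key in token for token in observation["tokens"])]


def check_syntax(source: str) -> str:
    """External action: ask the environment (Python's compiler) for feedback."""
    try:
        compile(source, "<agent>", "exec")
        return "ok"
    except SyntaxError as exc:
        return f"syntax error: {exc.msg}"


def step(memory: Memory, source: str) -> str:
    """One perceive -> retrieve -> act cycle; the outcome is logged as an episode."""
    observation = perceive(source)
    docs = retrieve(memory, observation)
    feedback = check_syntax(source)
    memory.episodic.append(f"docs={len(docs)} feedback={feedback}")
    return feedback


if __name__ == "__main__":
    memory = Memory(semantic={"json.loads": "json.loads(s) parses a JSON string"})
    print(step(memory, "import json\nx = json.loads('[1]')\n"))
    print(step(memory, "def f(:\n"))
    print(memory.episodic)
```

In a real agent, the reasoning step between `retrieve` and `check_syntax` would be an LLM call whose prompt includes the retrieved documents and recent episodes; it is omitted here to keep the sketch self-contained.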
Analysis of Challenges and Opportunities
The paper analyzes the challenges of applying LLM-based agents to software engineering and identifies promising directions for future research:
- Perception Module Exploration: Existing work focuses predominantly on token-based textual input and gives little attention to other modalities such as tree/graph-based, visual, and auditory inputs. Exploring these modalities could let agents handle a wider range of SE artifacts and use structural information that plain token sequences do not expose.
- Role-playing Abilities: Many SE tasks require agents to perform multiple roles simultaneously. The current models lack flexibility in assuming diverse roles and balancing multiple roles effectively. Developing mechanisms to extend the role-playing capabilities of these agents is crucial.
- Knowledge Retrieval Base: There is an absence of an established, rich, and reliable code-specific knowledge base in SE, analogous to repositories like Wikipedia in NLP. Constructing such an exhaustive and authoritative knowledge base could immensely contribute to the efficiency and accuracy of LLM-based agents.
- Hallucinations in LLM-based Agents: LLM-based agents often hallucinate, for example by generating calls to APIs that do not exist. Identifying the root causes of these hallucinations and mitigating them is essential for improving agent reliability.
- Efficiency of Multi-agent Collaboration: Collaboration among multiple agents is computationally demanding and incurs communication overhead. Techniques are needed that optimize resource allocation, reduce communication costs, and improve the overall efficiency of multi-agent systems.
- Integration of SE Technologies: Advanced SE techniques can significantly boost the functionality and performance of LLM-based agents. The integration of such technologies is underexplored and offers a promising avenue for future investigation.
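As one concrete illustration of the hallucination challenge, generated code can be screened for references to attributes that an imported module does not actually provide. The sketch below is an assumption-laden example rather than a technique taken from the paper: the function name `find_unknown_attributes` is hypothetical, only plain `import x` and `import x as y` statements are handled, and local names that shadow a module are not tracked. Note that `importlib.import_module` executes the imported module, so this should be run only against trusted packages.

```python
import ast
import importlib


def find_unknown_attributes(source: str) -> list[str]:
    """Return 'alias.attr' references whose attribute is missing from the module."""
    tree = ast.parse(source)

    # Map each local alias to the module object it refers to.
    modules = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if "." in alias.name:
                    continue  # dotted imports are out of scope for this sketch
                try:
                    modules[alias.asname or alias.name] = importlib.import_module(alias.name)
                except ImportError:
                    continue  # module not installed here; cannot be checked

    # Flag attribute accesses on those aliases that the module does not define.
    unknown = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.Attribute)
                and isinstance(node.value, ast.Name)
                and node.value.id in modules
                and not hasattr(modules[node.value.id], node.attr)):
            unknown.append(f"{node.value.id}.{node.attr}")
    return unknown


if __name__ == "__main__":
    generated = "import json\njson.loads('1')\njson.parse_fast('1')\n"
    print(find_unknown_attributes(generated))
```

A check of this kind catches only one narrow class of hallucination (missing top-level attributes). It says nothing about wrong argument lists or misused but existing APIs, which would need type information or execution feedback.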
Conclusion
The survey by Huang et al. examines how LLM-based agent technologies are being integrated into software engineering. By organizing related work into a framework of perception, memory, and action modules, the authors give a structured view of the current landscape. The challenges and opportunities they identify set an agenda for subsequent research on improving and extending LLM-based agents in SE. The authors' closing point is that the relationship runs in both directions: agents can support SE tasks, and SE techniques can in turn improve agents.