The paper introduces a multi-agent framework designed to enhance the capabilities of LLMs by enabling collaborative interactions among multiple intelligent agents. These agents, each possessing distinct attributes and roles, work together within a defined environment to handle complex tasks more effectively. The framework's utility is demonstrated through case studies involving AGI models, specifically Auto-GPT and BabyAGI, along with an examination of the "Gorilla" model, which integrates external APIs into an LLM. The work addresses challenges such as looping, security, scalability, evaluation, and ethical considerations.
The core idea is to combine multiple LLMs with diverse characteristics to improve performance across a range of tasks. The proposed framework uses multiple Intelligent Generative Agents (IGAs), each equipped with unique attributes and roles. The premise is that diversity enhances performance through a division of labor in which each agent specializes in a specific function. Within the multi-agent system, the IGAs interact and collaborate toward a shared goal by creating subtasks, seeking information, soliciting assistance from one another, and engaging in competitive evaluation.
The paper outlines the environment in which the multi-agent system operates as a black box, represented as a graph $G(V, E)$, where:
- $V$ is the set of vertices representing the IGAs and plugins.
- $E$ is the set of edges representing the connection channels between the agents and the plugins, and between the agents themselves.
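The graph view can be sketched in a few lines of Python. This is a minimal illustration rather than the paper's implementation: the `Environment` class, its method names, and the vertex labels are all hypothetical, and edges are treated as undirected channels for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class Environment:
    """Graph whose vertices are agents or plugins and whose edges are channels."""
    vertices: set[str] = field(default_factory=set)
    edges: set[frozenset[str]] = field(default_factory=set)

    def add_vertex(self, name: str) -> None:
        self.vertices.add(name)

    def connect(self, a: str, b: str) -> None:
        # A channel may only join two vertices that are already in the graph.
        if a not in self.vertices or b not in self.vertices:
            raise KeyError("both endpoints must be vertices of the graph")
        self.edges.add(frozenset((a, b)))

    def connected(self, a: str, b: str) -> bool:
        return frozenset((a, b)) in self.edges


# Hypothetical three-vertex system: two agents and one plugin.
env = Environment()
for name in ("main_agent", "critic_agent", "browser_plugin"):
    env.add_vertex(name)
env.connect("main_agent", "critic_agent")
env.connect("main_agent", "browser_plugin")
```

In this sketch the critic agent has no channel to the browser plugin, so any request it wants to make must go through the main agent.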
Each agent $A_i$ is represented as a tuple $A_i = (L_i, R_i, S_i, C_i, H_i)$, where:
- $L_i$ refers to the LLM instance utilized by the agent, including its type (e.g., GPT-4 or GPT-3.5-turbo) and configuration parameters like temperature.
- $R_i$ defines the agent's role, specifying its responsibilities and duties within the system.
- $S_i$ represents the agent's state, including its current knowledge base and thoughts.
- $C_i$ is a boolean property indicating whether an agent can create new agents.
- $H_i$ is the set of agents that this agent can halt.
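One way to picture the agent tuple is as a small record type. The sketch below is an assumption-laden illustration: the class names `LLMConfig` and `Agent`, the field names, the default temperature, and the two example agents are all invented here, and no LLM is actually called.

```python
from dataclasses import dataclass, field


@dataclass
class LLMConfig:
    """The LLM instance an agent uses: model type plus configuration."""
    model: str                 # e.g. "gpt-4" or "gpt-3.5-turbo"
    temperature: float = 0.7   # hypothetical default


@dataclass
class Agent:
    """The five agent components: LLM, role, state, creation flag, halt set."""
    name: str
    llm: LLMConfig
    role: str                                         # responsibilities and duties
    state: dict = field(default_factory=dict)         # knowledge base and thoughts
    can_create: bool = False                          # may this agent create agents?
    can_halt: set[str] = field(default_factory=set)   # names of agents it may halt


supervisor = Agent(
    name="supervisor",
    llm=LLMConfig(model="gpt-4", temperature=0.0),
    role="Monitor progress and stop agents that stall",
    can_create=True,
    can_halt={"worker"},
)
worker = Agent(
    name="worker",
    llm=LLMConfig(model="gpt-3.5-turbo"),
    role="Execute assigned subtasks",
)
```

Defaulting `can_create` to `False` and `can_halt` to the empty set means an agent gets these privileges only when the designer grants them explicitly.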
Each plugin $P_j$ is represented as a tuple $P_j = (F_j, C_j, U_j)$, where:
- $F_j$ is the set of the functionalities of the plugin.
- $C_j$ is the configurations associated with the plugin.
- $U_j$ is the set of usage constraints or conditions that govern the usage of the plugin.
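The plugin tuple can be sketched the same way. Everything named below is hypothetical: the `Plugin` class, the in-memory `store`/`recall` functions standing in for a memory plugin, and the 100-character limit used as an example usage constraint.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Plugin:
    """Functionalities, configuration, and usage constraints of one plugin."""
    name: str
    functionalities: dict[str, Callable[..., object]]
    config: dict = field(default_factory=dict)
    # Each constraint receives the function name and its arguments
    # and returns True when the call is allowed.
    constraints: list[Callable[[str, dict], bool]] = field(default_factory=list)

    def call(self, function: str, **kwargs):
        if function not in self.functionalities:
            raise KeyError(f"{self.name} has no functionality {function!r}")
        for allowed in self.constraints:
            if not allowed(function, kwargs):
                raise PermissionError(f"{function!r} violates a usage constraint")
        return self.functionalities[function](**kwargs)


memory: list[str] = []  # toy backing store for the example plugin


def store(text: str) -> int:
    memory.append(text)
    return len(memory)


def recall(index: int) -> str:
    return memory[index]


def short_writes_only(function: str, kwargs: dict) -> bool:
    # Hypothetical constraint: stored entries are capped at 100 characters.
    return function != "store" or len(kwargs.get("text", "")) <= 100


memory_plugin = Plugin(
    name="memory",
    functionalities={"store": store, "recall": recall},
    config={"backend": "in-process list"},
    constraints=[short_writes_only],
)
memory_plugin.call("store", text="user prefers concise answers")
```

Routing every call through `call` keeps the constraint check in one place, so an agent cannot reach a functionality without passing the plugin's usage conditions.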
Each message $m$, sent from agent $A_i$ to $A_j$, is represented as a tuple $m = (S_m, A_m, D_m)$, where:
- $S_m$ is the content of the message.
- $A_m$ is the action associated with the message.
- $D_m$ is the metadata associated with the message.
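A message can be sketched as an immutable record carrying content, action, and metadata, delivered only along an existing channel. The `Message` class, the `deliver` helper, the directed `channels` set, and the action name `"create_subtask"` are illustrative assumptions, not names taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    """Content, action, and metadata, plus the sending and receiving agents."""
    sender: str
    receiver: str
    content: str
    action: str
    metadata: dict = field(default_factory=dict)


def deliver(message: Message,
            channels: set[tuple[str, str]],
            inboxes: dict[str, list[Message]]) -> None:
    """Append the message to the receiver's inbox if a channel exists."""
    if (message.sender, message.receiver) not in channels:
        raise ValueError(f"no channel from {message.sender} to {message.receiver}")
    inboxes.setdefault(message.receiver, []).append(message)


# Hypothetical one-way channel from a planner agent to an executor agent.
channels = {("planner", "executor")}
inboxes: dict[str, list[Message]] = {}

msg = Message(
    sender="planner",
    receiver="executor",
    content="Summarize the latest search results",
    action="create_subtask",
    metadata={"priority": 1},
)
deliver(msg, channels, inboxes)
```

Channels are directed here so that the example can show a rejected delivery: the executor cannot message the planner unless the reverse pair is also added.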
The paper details the system design process: determining the number of agents and required plugins, establishing the connections between them, and assigning roles and properties so that the configuration supports efficient collaboration. Agents that have the creation capability can add new agents dynamically to distribute workload and assign responsibilities. The designer specifies the initial system, and dynamic addition provides flexibility and adaptation within that design. Feedback mechanisms, both inter-agent feedback and self-feedback, are integral to the system and let agents learn from experience and adapt their strategies. The paper also introduces the oracle agent, which is stateless and memoryless: it acts solely on its current input, which suits tasks that do not depend on previous interactions. A halting mechanism lets an agent halt other agents under certain conditions, which is crucial for management and coordination; the paper's example is a supervisor agent that monitors progress and task lists. Finally, an IGA can itself act as the system designer, defining roles, responsibilities, interactions, and connections, or refining an already designed system.
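Dynamic agent creation and halting both amount to permission checks, which the following sketch makes concrete. The `System` class, its method names, and the `is_looping` heuristic (an output repeated three times) are assumptions made for illustration; the paper does not prescribe a specific loop test.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    can_create: bool = False
    can_halt: set[str] = field(default_factory=set)
    halted: bool = False


class System:
    def __init__(self) -> None:
        self.agents: dict[str, Agent] = {}

    def add(self, agent: Agent) -> Agent:
        self.agents[agent.name] = agent
        return agent

    def spawn(self, creator: str, name: str) -> Agent:
        # Dynamic addition: only agents with the creation capability may spawn.
        if not self.agents[creator].can_create:
            raise PermissionError(f"{creator} may not create agents")
        return self.add(Agent(name=name))

    def halt(self, requester: str, target: str) -> None:
        # Halting: the target must be in the requester's halt set.
        if target not in self.agents[requester].can_halt:
            raise PermissionError(f"{requester} may not halt {target}")
        self.agents[target].halted = True


def is_looping(outputs: list[str], max_repeats: int = 3) -> bool:
    """Hypothetical loop test: some output appears max_repeats times or more."""
    return any(n >= max_repeats for n in Counter(outputs).values())


system = System()
system.add(Agent("supervisor", can_create=True, can_halt={"main"}))
system.add(Agent("main"))
system.spawn("supervisor", "helper")

# The supervisor watches the main agent's recent outputs and halts it on a loop.
history = ["search docs", "search docs", "search docs"]
if is_looping(history):
    system.halt("supervisor", "main")
```

Because a spawned agent starts with no privileges in this sketch, the supervisor cannot halt `helper` unless `helper` is added to its halt set, which is one simple guard against uncontrolled growth of authority.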
The framework's applicability is demonstrated through case studies involving Auto-GPT and BabyAGI. Auto-GPT's architecture is modeled, identifying its main agent, plugins for tasks like internet browsing and memory management, and an oracle agent for summarization and response criticism. Limitations such as getting stuck in loops are addressed by adding a supervisor agent. For BabyAGI, the framework models agents for task creation, prioritization, and execution, along with a plugin for interacting with a vector database. The framework improves upon the current implementation of BabyAGI by providing a more structured and modular approach to designing the system. The "Gorilla" model is analyzed, with the framework customizing it for various use cases by efficiently handling real-time API updates.
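The BabyAGI decomposition into task-creation, prioritization, and execution agents can be sketched as a loop over three functions. This is a toy under stated assumptions: each function is a deterministic stand-in for an LLM call, a plain list replaces the vector database plugin, and the "review" follow-up rule and the `max_steps` cap are invented for the example.

```python
def execution_agent(task: str) -> str:
    # Stand-in for an LLM call that carries out the task.
    return f"result of '{task}'"


def creation_agent(task: str, result: str) -> list[str]:
    # Stand-in for an LLM call that proposes follow-up tasks.
    # Hypothetical rule: every non-review task gets exactly one review task.
    return [] if task.startswith("review ") else [f"review {task}"]


def prioritization_agent(tasks: list[str]) -> list[str]:
    # Stand-in for an LLM call that reorders the queue.
    # Hypothetical rule: review tasks go last; other order is preserved.
    return sorted(tasks, key=lambda t: t.startswith("review "))


def run(initial_tasks: list[str], max_steps: int = 10) -> list[tuple[str, str]]:
    tasks = list(initial_tasks)
    memory: list[tuple[str, str]] = []  # stand-in for the vector database plugin
    steps = 0
    # The step cap bounds the loop even if task creation never stops.
    while tasks and steps < max_steps:
        task = tasks.pop(0)
        result = execution_agent(task)
        memory.append((task, result))
        tasks.extend(creation_agent(task, result))
        tasks = prioritization_agent(tasks)
        steps += 1
    return memory


log = run(["draft outline", "collect sources"])
```

Keeping the three roles as separate functions mirrors the modular structure the framework argues for: any one of them could be swapped for a real LLM-backed agent without touching the loop.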
To illustrate practical applications, the paper presents case studies in a court simulation and a software development scenario. The court simulation models roles such as judge, jury, and attorneys as agents, each equipped with specific responsibilities and capabilities. Similarly, in software development, agents embody roles like user experience designer, product manager, and software developer, optimizing and streamlining the development process.
The paper acknowledges challenges and limitations inherent in multi-agent systems, including the risk of over-proliferation of agents, scalability issues, complexities in system evaluation, and ethical considerations. Mitigation strategies include resource management modules, coordination mechanisms, and the implementation of ethical guidelines and safeguards.