Anemoi: A Semi-Centralized Multi-agent System Based on Agent-to-Agent Communication MCP server from Coral Protocol (2508.17068v2)

Published 23 Aug 2025 in cs.MA and cs.CL

Abstract: Recent advances in generalist multi-agent systems (MAS) have largely followed a context-engineering plus centralized paradigm, where a planner agent coordinates multiple worker agents through unidirectional prompt passing. While effective under strong planner models, this design suffers from two critical limitations: (1) strong dependency on the planner's capability, which leads to degraded performance when a smaller LLM powers the planner; and (2) limited inter-agent communication, where collaboration relies on costly prompt concatenation and context injection, introducing redundancy and information loss. To address these challenges, we propose Anemoi, a semi-centralized MAS built on the Agent-to-Agent (A2A) communication MCP server from Coral Protocol. Unlike traditional designs, Anemoi enables structured and direct inter-agent collaboration, allowing all agents to monitor progress, assess results, identify bottlenecks, and propose refinements in real time. This paradigm reduces reliance on a single planner, supports adaptive plan updates, and minimizes redundant context passing, resulting in more scalable and cost-efficient execution. Evaluated on the GAIA benchmark, Anemoi achieved 52.73% accuracy with a small LLM (GPT-4.1-mini) as the planner, surpassing the strongest open-source baseline OWL (43.63%) by +9.09% under identical LLM settings. Our implementation is publicly available at https://github.com/Coral-Protocol/Anemoi.


Summary

  • The paper introduces a semi-centralized MAS that uses structured A2A communication via an MCP server to overcome centralized planning limitations.
  • It demonstrates a +9.09 percentage-point improvement over the OWL system on the GAIA benchmark, emphasizing collaborative refinement and reduced context redundancy.
  • The architecture minimizes token overhead and latency while enabling scalable, modular integration of diverse agent types for real-world applications.

Anemoi: A Semi-Centralized Multi-Agent System Leveraging Agent-to-Agent Communication via MCP Server

Introduction and Motivation

The Anemoi framework introduces a semi-centralized multi-agent system (MAS) architecture, fundamentally diverging from the prevailing context-engineering plus centralized paradigm in generalist MAS. Traditional systems rely on a strong planner agent to coordinate worker agents through unidirectional prompt passing, which results in two major limitations: (1) performance degradation when the planner is powered by a smaller LLM, and (2) inefficient inter-agent collaboration due to prompt concatenation and context injection, leading to redundancy and information loss. Anemoi addresses these issues by enabling structured, direct agent-to-agent (A2A) communication via the MCP server from Coral Protocol, allowing all agents to monitor progress, assess results, and propose refinements in real time (Figure 1).

Figure 1: Architecture of Anemoi, a semi-centralized multi-agent system based on the A2A communication MCP server from Coral Protocol.

System Architecture and Communication Protocol

Anemoi's architecture is built around a dedicated MCP server that facilitates thread-based communication among agents. Each agent connects to the MCP server, which provides primitives for agent discovery, thread management, and message exchange. Threads serve as contextual compartments, ensuring that messages remain within their respective conversations and supporting directed queries and task delegation.
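The three primitives named above (agent discovery, thread management, message exchange) can be illustrated with a minimal sketch. The class and method names below are our own illustrative assumptions, not the actual Coral Protocol MCP API:

```python
# Toy sketch of thread-based A2A messaging: threads act as contextual
# compartments, so messages stay within their own conversation.
# All names here are hypothetical, for exposition only.
from dataclasses import dataclass, field


@dataclass
class Thread:
    """One conversation: only its participants may post to it."""
    topic: str
    participants: set
    messages: list = field(default_factory=list)


class MCPServer:
    def __init__(self):
        self.agents = {}       # name -> capability description
        self.threads = {}      # thread id -> Thread
        self._next_id = 0

    def register(self, name, capability):
        """Agent discovery: announce an agent and what it can do."""
        self.agents[name] = capability

    def discover(self):
        return dict(self.agents)

    def create_thread(self, topic, participants):
        """Thread management: open a new contextual compartment."""
        tid = self._next_id
        self._next_id += 1
        self.threads[tid] = Thread(topic, participants)
        return tid

    def send(self, tid, sender, text):
        """Message exchange: directed posts within one thread."""
        thread = self.threads[tid]
        if sender not in thread.participants:
            raise PermissionError(f"{sender} is not in thread {tid}")
        thread.messages.append((sender, text))


server = MCPServer()
server.register("planner", "plans tasks")
server.register("web", "web search")
tid = server.create_thread("task-1", {"planner", "web"})
server.send(tid, "planner", "Step 1: search for X")
```

The key property this mirrors is contextual isolation: an agent outside a thread's participant set cannot inject messages into it.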

The agent composition includes:

  • Planner Agent: Generates the initial plan and initiates coordination.
  • Critique Agent: Evaluates contributions for validity and certainty.
  • Answer-Finding Agent: Compiles and submits the final response.
  • Web Agent: Executes web searches and simulates browser actions.
  • Document Processing Agent: Handles diverse document formats.
  • Reasoning/Coding Agent: Specializes in reasoning, coding, and offline data processing.

All agents are integrated with the MCP toolkit, enabling them to monitor progress, track step completion, and propose new ideas throughout execution. The communication pattern supports dynamic plan refinement and consensus-based answer submission, reducing reliance on a single planner and minimizing redundant context passing (Figure 2).

Figure 2: Overview of Anemoi. The system comprises a planner agent that produces the initial plan and a set of worker agents with different capabilities. The A2A communication MCP server enables all agents to monitor progress together.
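The refinement pattern described above can be sketched as a simple loop: any step whose result fails a critique check is retried before the task proceeds, rather than being routed back through the planner. The function names, the 0.5 acceptance threshold, and the refinement cap below are assumptions for exposition, not the paper's implementation:

```python
# Illustrative sketch of critique-driven plan execution with refinement.
# `execute` and `critique` stand in for LLM-backed agents; in this toy
# version they are plain callables.

def run_task(plan, execute, critique, max_refinements=3):
    results = []
    for step in plan:
        result = execute(step)
        attempts = 0
        # A low critique score triggers refinement of this step only,
        # instead of a full re-plan by a central planner.
        while critique(step, result) < 0.5 and attempts < max_refinements:
            result = execute(step + " (refined)")
            attempts += 1
        results.append(result)
    return "; ".join(results)


# Toy run with stand-in agents that always succeed:
answer = run_task(
    plan=["search topic", "summarize findings"],
    execute=lambda step: f"done: {step}",
    critique=lambda step, result: 1.0,  # always accept in this toy run
)
```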

Experimental Evaluation

Benchmark and Baselines

Anemoi was evaluated on the GAIA benchmark, which comprises real-world, multi-step tasks requiring web search, multi-modal file processing, and coding capabilities. The experimental setup ensured parity with the strongest open-source baseline, OWL, by using identical worker agent configurations and toolkits. The planner agent in Anemoi was powered by GPT-4.1-mini, while worker agents used GPT-4o. This configuration was chosen to highlight the robustness of the semi-centralized paradigm under weaker planner models.

Performance Results

Anemoi achieved an accuracy of 52.73% on the GAIA validation set (pass@3), outperforming OWL (43.63%) by +9.09 percentage points under identical LLM settings. Notably, Anemoi also surpassed several proprietary and open-source frameworks that employed stronger LLMs, demonstrating the efficacy of the A2A communication paradigm in mitigating the limitations of context-engineering-based coordination.
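For clarity, pass@3 means a task counts as solved if any of three independent runs succeeds. A minimal sketch of the metric, reflecting our reading rather than the paper's evaluation script:

```python
# pass@k accuracy: fraction of tasks for which at least one of the k
# runs succeeded.

def pass_at_k(runs_per_task):
    """runs_per_task: list of per-task lists of booleans (one per run)."""
    solved = sum(1 for runs in runs_per_task if any(runs))
    return solved / len(runs_per_task)


# Toy example: 4 tasks, 3 runs each.
acc = pass_at_k([
    [False, True, False],   # solved on the second run
    [True, True, True],
    [False, False, False],  # never solved
    [False, False, True],
])
# acc == 0.75
```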

Comparative Task Attribution and Error Analysis

Task Attribution Analysis

A detailed comparison of task attribution between Anemoi and OWL revealed that Anemoi's additional successes were primarily due to collaborative refinement (52%) and reduced context redundancy (8%), with the remainder attributed to stochastic worker behavior (40%). Conversely, OWL's successes over Anemoi were predominantly due to stochastic worker behavior (90%) and, to a lesser extent, communication latency (10%) (Figure 3).

Figure 3: Comparison of task attribution categories between Anemoi and OWL. The donut chart illustrates the distribution of reasons why Anemoi succeeded where OWL failed, and vice versa.

Error Analysis

Anemoi's remaining errors were analyzed, with the largest fraction attributed to LLM capability limitations (45.6%), followed by toolkit limitations (20.6%), incorrect plans (11.8%), communication latency (10.3%), annotation mistakes (7.4%), and LLM hallucinations (4.4%). The error profile underscores the importance of further improving agent toolkits and LLM reliability, as well as optimizing communication latency in agent orchestration (Figure 4).

Figure 4: Remaining errors of Anemoi.

Implementation Considerations and Trade-offs

The Anemoi system demonstrates that semi-centralized coordination via A2A communication can sustain high performance even with weaker planner models, provided that worker agents are sufficiently capable. The thread-based MCP server architecture offers contextual isolation and efficient message routing, reducing token overhead and improving scalability. However, the system's performance is still bounded by the capabilities of the underlying LLMs and toolkits, and communication latency can impact task completion in time-sensitive scenarios.

Resource requirements are moderate, as the MCP server can be deployed on standard cloud infrastructure, and agent orchestration scales linearly with the number of agents. The modular design facilitates integration of new agent types and toolkits, supporting extensibility for domain-specific applications.

Implications and Future Directions

The Anemoi framework advances the state of MAS by demonstrating that direct, structured inter-agent communication can overcome the bottlenecks of centralized planning and context engineering. The empirical results suggest that future MAS architectures should prioritize adaptive, consensus-driven coordination and minimize reliance on prompt concatenation. Further research should focus on enhancing agent toolkits, improving LLM reliability, and optimizing communication protocols to reduce latency.

Potential future developments include:

  • Integration of more diverse agent types (e.g., multimodal reasoning, external API agents).
  • Exploration of decentralized consensus mechanisms for fully distributed MAS.
  • Application of Anemoi in real-world domains such as autonomous research, enterprise automation, and collaborative robotics.

Conclusion

Anemoi introduces a robust semi-centralized MAS architecture leveraging A2A communication via the MCP server, enabling scalable, adaptive, and cost-efficient agent coordination. The system achieves strong empirical performance on the GAIA benchmark, particularly under weaker planner models, and provides a blueprint for future MAS designs that emphasize direct inter-agent collaboration and dynamic plan refinement. The results highlight the practical and theoretical advantages of structured agent communication, marking a significant step toward scalable, generalist multi-agent AI systems.
