Anemoi: A Semi-Centralized Multi-agent System Based on Agent-to-Agent Communication MCP server from Coral Protocol (2508.17068v2)

Published 23 Aug 2025 in cs.MA and cs.CL

Abstract: Recent advances in generalist multi-agent systems (MAS) have largely followed a context-engineering plus centralized paradigm, where a planner agent coordinates multiple worker agents through unidirectional prompt passing. While effective under strong planner models, this design suffers from two critical limitations: (1) strong dependency on the planner's capability, which leads to degraded performance when a smaller LLM powers the planner; and (2) limited inter-agent communication, where collaboration relies on costly prompt concatenation and context injection, introducing redundancy and information loss. To address these challenges, we propose Anemoi, a semi-centralized MAS built on the Agent-to-Agent (A2A) communication MCP server from Coral Protocol. Unlike traditional designs, Anemoi enables structured and direct inter-agent collaboration, allowing all agents to monitor progress, assess results, identify bottlenecks, and propose refinements in real time. This paradigm reduces reliance on a single planner, supports adaptive plan updates, and minimizes redundant context passing, resulting in more scalable and cost-efficient execution. Evaluated on the GAIA benchmark, Anemoi achieved 52.73% accuracy with a small LLM (GPT-4.1-mini) as the planner, surpassing the strongest open-source baseline OWL (43.63%) by +9.09% under identical LLM settings. Our implementation is publicly available at https://github.com/Coral-Protocol/Anemoi.

Summary

  • The paper introduces Anemoi, a semi-centralized multi-agent system using A2A MCP for real-time, collaborative plan refinement.
  • It reports a 9.09 percentage point accuracy gain over the OWL baseline on the GAIA benchmark (52.73% vs. 43.63%), along with reduced context redundancy.
  • The system minimizes dependency on a powerful planner by enabling heterogeneous agents to communicate directly and efficiently.

Anemoi: A Semi-Centralized Multi-Agent System with Direct Agent-to-Agent Communication

Introduction

The paper presents Anemoi, a semi-centralized multi-agent system (MAS) that leverages the Agent-to-Agent (A2A) communication Model Context Protocol (MCP) server from Coral Protocol to address the limitations of traditional context-engineering-based, centralized MAS architectures. The primary motivation is to reduce the dependency on a single, powerful planner LLM and to enable more efficient, scalable, and robust agent collaboration through structured, real-time inter-agent communication. The system is evaluated on the GAIA benchmark, demonstrating significant improvements over state-of-the-art open-source baselines, particularly when the planner is a smaller LLM.

Figure 1: Architecture of Anemoi, a semi-centralized multi-agent system based on the A2A communication MCP server from Coral Protocol.

Background and Motivation

Traditional MAS frameworks typically employ a centralized planner that decomposes tasks and coordinates worker agents via unidirectional prompt passing. This approach, while effective with strong LLMs, suffers from two critical drawbacks:

  1. Planner Dependency: System performance is tightly coupled to the planner's LLM capability. Substituting a strong LLM with a smaller one leads to substantial performance degradation.
  2. Limited Inter-Agent Communication: Collaboration is realized through prompt concatenation and manual context injection, resulting in high token overhead, redundancy, and information loss.

Anemoi is designed to overcome these bottlenecks by introducing a semi-centralized architecture where all agents can directly communicate, monitor progress, and collaboratively refine plans in real time.

System Architecture

Anemoi's architecture is built around the A2A communication MCP server, which provides thread-based, structured communication primitives for agent discovery, thread management, and message exchange. The system comprises the following agent types:

  • Planner Agent: Generates the initial plan and initiates coordination.
  • Critique Agent: Continuously evaluates agent outputs for validity.
  • Answer-Finding Agent: Compiles and submits the final answer.
  • Web Agent: Handles web search and online information retrieval.
  • Document Processing Agent: Processes various document formats.
  • Reasoning/Coding Agent: Specializes in reasoning, coding, and offline computation.

Figure 2: Overview of Anemoi. The system includes a planning agent that makes the initial plan and a set of agents with different capabilities. The A2A communication MCP server enables all agents to monitor progress together.

Each agent is integrated with the MCP toolkit, enabling dynamic participation in communication threads, direct message passing, and real-time monitoring of task progress.

Communication Protocol and Workflow

The A2A MCP server exposes a set of primitives (list_agents, create_thread, send_message, etc.) that facilitate structured, thread-based communication. The workflow proceeds as follows:

  1. Agent Discovery: Agents enumerate available participants.
  2. Thread Initialization: The planner creates a thread, broadcasts the initial plan, and allocates subtasks.
  3. Task Execution and Monitoring: Worker agents execute subtasks, the critique agent evaluates their outputs, and any agent can propose refinements or alternative strategies.
  4. Consensus: Before submission, all agents vote on the candidate solution.
  5. Answer Submission: The answer-finding agent submits the validated result.

This protocol enables adaptive plan refinement, reduces reliance on the planner, and minimizes redundant context passing, leading to improved scalability and efficiency.
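The five workflow steps above can be sketched with a toy in-memory message bus. This is a minimal illustration, not the Coral Protocol server or its real API: the `ToyA2AServer` class, its method signatures, the agent names, and the strict-majority voting rule are all assumptions made for the example. Only the primitive names `list_agents`, `create_thread`, and `send_message` come from the text.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Message:
    sender: str
    content: str


@dataclass
class Thread:
    name: str
    participants: list[str]
    messages: list[Message] = field(default_factory=list)


class ToyA2AServer:
    """Hypothetical in-memory stand-in for a thread-based A2A server."""

    def __init__(self) -> None:
        self.agents: list[str] = []
        self.threads: dict[str, Thread] = {}

    def register(self, agent: str) -> None:
        self.agents.append(agent)

    def list_agents(self) -> list[str]:
        return list(self.agents)

    def create_thread(self, name: str, participants: list[str]) -> Thread:
        thread = Thread(name, list(participants))
        self.threads[name] = thread
        return thread

    def send_message(self, thread_name: str, sender: str, content: str) -> None:
        thread = self.threads[thread_name]
        if sender not in thread.participants:
            raise ValueError(f"{sender} is not a participant of {thread_name}")
        # Every participant can read thread.messages, so progress is shared.
        thread.messages.append(Message(sender, content))


def majority_vote(votes: dict[str, str]) -> str | None:
    """Return the candidate backed by a strict majority, else None (assumed rule)."""
    if not votes:
        return None
    candidate, count = Counter(votes.values()).most_common(1)[0]
    return candidate if count * 2 > len(votes) else None


def run_demo() -> tuple[Thread, str | None]:
    server = ToyA2AServer()
    for name in ["planner", "critique", "answer_finder", "web", "document", "reasoning"]:
        server.register(name)

    participants = server.list_agents()                      # 1. agent discovery
    thread = server.create_thread("task-1", participants)    # 2. thread initialization
    server.send_message("task-1", "planner", "Plan: web searches, reasoning computes")
    server.send_message("task-1", "web", "Found source value: 42")  # 3. execution
    server.send_message("task-1", "critique", "Web result looks valid")
    server.send_message("task-1", "reasoning", "Proposed answer: 42")

    votes = {agent: "42" for agent in participants}          # 4. consensus
    votes["document"] = "41"                                 # one dissenting vote
    answer = majority_vote(votes)

    if answer is not None:                                   # 5. answer submission
        server.send_message("task-1", "answer_finder", f"Final answer: {answer}")
    return thread, answer
```

Calling `run_demo()` returns the shared thread and the agreed answer. Because every message lands in one thread that all participants can read, no agent needs the planner to relay or re-concatenate context on its behalf.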

Experimental Evaluation

Baselines and Implementation

Anemoi is evaluated on the GAIA benchmark, which tests multi-step, real-world tasks requiring web search, document processing, and coding. The worker agents and toolkits are identical to those used in the OWL baseline, ensuring a controlled comparison. The planner agent uses GPT-4.1-mini, while worker agents use GPT-4o.

Main Results

Anemoi achieves an average accuracy of 52.73% on the GAIA validation set, outperforming the strongest open-source baseline OWL (43.63%) by +9.09 percentage points under identical LLM configurations. Notably, Anemoi with a weaker planner surpasses several proprietary and open-source frameworks that employ stronger LLMs, underscoring the efficacy of the A2A-based semi-centralized paradigm.
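The reported gap can be sanity-checked with a few lines of Python. The sketch below assumes the GAIA validation split contains 165 tasks. The solved-task counts are inferred from the reported percentages and are not stated in this summary, so treat them as an illustration rather than reported data.

```python
GAIA_VALIDATION_TASKS = 165  # assumption: size of the GAIA validation split

anemoi_acc = 52.73  # reported accuracy, in percent
owl_acc = 43.63     # reported accuracy, in percent

# Absolute gap from the rounded percentages, in percentage points.
gap_points = round(anemoi_acc - owl_acc, 2)

# Relative gain over the baseline, a different and larger number.
relative_gain = round((anemoi_acc - owl_acc) / owl_acc * 100, 2)

# Inferred task counts under the 165-task assumption.
anemoi_solved = round(anemoi_acc / 100 * GAIA_VALIDATION_TASKS)
owl_solved = round(owl_acc / 100 * GAIA_VALIDATION_TASKS)

# Gap recomputed from whole-task counts.
gap_from_counts = round((anemoi_solved - owl_solved) / GAIA_VALIDATION_TASKS * 100, 2)

print(gap_points, relative_gain, anemoi_solved, owl_solved, gap_from_counts)
```

Under that assumption the counts come out to 87 and 72 tasks, and the count-based gap is 9.09 points, while subtracting the two rounded percentages gives 9.10. This would explain the small mismatch between the quoted accuracies and the quoted +9.09. The relative gain over OWL is roughly 21%, which is why the difference is best stated in percentage points.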

Comparative and Error Analysis

Task Attribution Analysis

A detailed comparison between Anemoi and OWL shows that Anemoi solves 25 tasks on which OWL fails. Of these, 52% are attributed to collaborative refinement and 8% to reduced context redundancy. Conversely, most tasks solved by OWL but not by Anemoi are attributed to stochastic worker behavior and, to a lesser extent, communication latency.

Figure 3: Comparison of task attribution categories between Anemoi and OWL. The donut chart illustrates the distribution of reasons why Anemoi succeeded where OWL failed, and vice versa.

Error Breakdown

Anemoi's remaining errors break down as follows: LLM capability limitations (45.6%), toolkit constraints (20.6%), incorrect plans (11.8%), communication latency (10.3%), annotation errors (7.4%), and LLM hallucinations (4.4%).
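A short tally makes the breakdown easier to read. The percentages below are the ones quoted above; the grouping into model-side and system-side causes is an assumption made for this illustration and does not come from the paper.

```python
remaining_errors = {
    "LLM capability limitations": 45.6,
    "toolkit constraints": 20.6,
    "incorrect plans": 11.8,
    "communication latency": 10.3,
    "annotation errors": 7.4,
    "LLM hallucinations": 4.4,
}

# The shares sum to slightly more than 100 because each one is rounded.
total = round(sum(remaining_errors.values()), 1)

# Hypothetical grouping: errors rooted in the underlying model ...
model_side = round(
    remaining_errors["LLM capability limitations"]
    + remaining_errors["LLM hallucinations"],
    1,
)

# ... versus errors rooted in tooling, planning, or the communication layer.
system_side = round(
    remaining_errors["toolkit constraints"]
    + remaining_errors["incorrect plans"]
    + remaining_errors["communication latency"],
    1,
)

print(total, model_side, system_side)  # 100.1 50.0 42.7
```

Under this grouping, about half of the remaining errors trace back to the underlying models rather than to the coordination scheme, and communication latency is the only category tied directly to the A2A layer.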

Figure 4: Remaining errors of Anemoi.

Implications and Future Directions

The Anemoi architecture suggests that a semi-centralized MAS with direct A2A communication can improve performance, robustness, and scalability, especially when the planner is a smaller LLM. The reduction in token overhead and the ability for agents to collaboratively refine plans in real time are particularly advantageous for complex, multi-step tasks.

Theoretically, this work suggests that MAS architectures should move beyond rigid, centralized planning and embrace more flexible, communication-rich paradigms. Practically, the approach enables cost-effective deployment of MAS in environments where access to large LLMs is limited.

Future research directions include:

  • Enhancing agent autonomy and specialization.
  • Improving fault tolerance and recovery from communication failures.
  • Extending the protocol to support heterogeneous agent populations and human-in-the-loop scenarios.
  • Investigating the impact of more advanced consensus mechanisms and dynamic agent instantiation.

Conclusion

Anemoi introduces a semi-centralized MAS architecture that leverages structured A2A communication to overcome the limitations of context-engineering-based, centralized systems. The empirical results on the GAIA benchmark demonstrate substantial performance gains, particularly in settings with weaker planner LLMs. This work provides a concrete foundation for scalable, robust, and efficient MAS, and points toward a future where agent collaboration is both adaptive and communication-efficient.
