Anemoi MAS Architecture
- Anemoi is a semi-centralized multi-agent system that leverages structured agent-to-agent communication via the Coral Protocol for dynamic task execution.
- It delegates initial plan generation to a planner while empowering specialized worker agents for real-time collaboration and iterative refinements.
- Empirical evaluations on the GAIA benchmark show a 9.09 percentage-point improvement over the OWL baseline, highlighting enhanced accuracy, scalability, and reduced token overhead.
Anemoi denotes a semi-centralized multi-agent system (MAS) architecture predicated on structured agent-to-agent (A2A) communication, implemented via the MCP server from Coral Protocol. Contrasting with prevailing centralized paradigms that center all orchestration within a single planner agent, Anemoi delegates only initial plan generation to the planner and empowers specialized worker agents—such as web, document processing, and reasoning/coding agents—to engage in direct, real-time intercommunication. This configuration enables adaptive task execution, bottleneck identification, and iterative plan refinement, while reducing redundant prompt passing and planner dependency.
1. System Architecture and Centralization Paradigm
Anemoi diverges from classical context-engineering MAS architectures characterized by unidirectional prompt-based exchanges from a centralized planner toward worker agents. Instead, it incorporates a hybrid (semi-centralized) structure: the planner agent’s sole responsibility is to generate the initial plan; subsequent task execution, monitoring, and troubleshooting are delegated to dynamically interacting worker agents. All participating agents can monitor execution, assess partial results, propose corrections, and execute refinements continuously during the task's progression.
The architecture includes:
- Planner Agent: Produces an initial action plan and instantiates the multi-agent communication thread.
- Worker Agents: Specialized for domain-specific subtasks (e.g., web search, document parsing, code reasoning), these agents collaborate directly during execution.
- Critique Agents: Evaluate outputs from worker agents and facilitate collective plan refinement by providing structured feedback.
The system’s semi-centralized topology minimizes the planner’s operational burden, enabling increased robustness in scenarios involving less powerful planner models.
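The three roles above can be summarized in a minimal sketch. The class and field names here are illustrative assumptions, not the Anemoi codebase: the point is only that the planner is one agent among many rather than a central dispatcher.

```python
from dataclasses import dataclass, field

# Illustrative role model; names are assumptions, not the Anemoi API.
@dataclass
class Agent:
    name: str
    role: str            # "planner", "worker", or "critique"
    specialty: str = ""  # e.g. "web search", "document parsing", "code reasoning"

@dataclass
class AnemoiSystem:
    planner: Agent
    workers: list = field(default_factory=list)
    critics: list = field(default_factory=list)

    def participants(self) -> list:
        # After the initial plan, all agents join the same shared thread;
        # the planner holds no special runtime authority.
        return [self.planner, *self.workers, *self.critics]

system = AnemoiSystem(
    planner=Agent("planner", "planner"),
    workers=[
        Agent("web", "worker", "web search"),
        Agent("doc", "worker", "document parsing"),
        Agent("coder", "worker", "code reasoning"),
    ],
    critics=[Agent("critic", "critique")],
)
```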
2. Structured Agent-to-Agent Communication via MCP
At the core of Anemoi’s design is the Coral Protocol’s Agent-to-Agent Communication MCP server, which provides a thread-based, structured environment for agent interactions. Key communication operations are as follows:
- Agent Discovery: Agents register with the server and enumerate their peers via `list_agents`.
- Thread Instantiation: The planner creates a thread `τ` spanning the full agent set `A = {a_1, …, a_n}`.
- Message Exchange: Agents communicate via `send_message(τ, m)`; each agent `a_i` waits for direct mentions using `wait_for_mentions(τ, a_i)` to coordinate task transitions.
- Dynamic Participation: Agents can be added or removed (via `add_participant`, `remove_participant`) based on execution needs.
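The operations above can be mocked with a small in-memory server. This is a sketch only: the real Coral Protocol exposes these operations over MCP, and the method signatures below are assumptions for illustration, not the server's actual interface.

```python
from collections import defaultdict

# In-memory mock of the thread-based A2A operations described above.
# Signatures are illustrative assumptions, not the Coral Protocol API.
class MockThreadServer:
    def __init__(self):
        self.agents = set()
        self.threads = {}                      # thread_id -> participant set
        self.messages = defaultdict(list)      # thread_id -> [(sender, text)]

    def register(self, agent: str) -> None:
        self.agents.add(agent)

    def list_agents(self) -> list:
        # Agent discovery: enumerate registered peers.
        return sorted(self.agents)

    def create_thread(self, thread_id: str, participants) -> None:
        self.threads[thread_id] = set(participants)

    def send_message(self, thread_id: str, sender: str, text: str) -> None:
        self.messages[thread_id].append((sender, text))

    def wait_for_mentions(self, thread_id: str, agent: str) -> list:
        # Polling stand-in for blocking wait: return messages that
        # @-mention this agent, which drive task transitions.
        return [text for _, text in self.messages[thread_id]
                if f"@{agent}" in text]

    def add_participant(self, thread_id: str, agent: str) -> None:
        self.threads[thread_id].add(agent)

    def remove_participant(self, thread_id: str, agent: str) -> None:
        self.threads[thread_id].discard(agent)

# Planner instantiates the shared thread over all agents, then hands off.
srv = MockThreadServer()
for name in ("planner", "web", "doc", "coder", "critic"):
    srv.register(name)
srv.create_thread("t1", srv.list_agents())
srv.send_message("t1", "planner", "@web fetch the benchmark page")
```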
Task execution is formalized as follows:
- A worker agent `a_i` acts on a subtask `s_i`, producing a result `r_i`.
- A critique agent evaluates this result, returning feedback `f_i = critique(r_i)` that either accepts `r_i` or drives refinement.
This structured approach obviates brittle prompt concatenation and repeated context injection, supporting robust, low-redundancy coordination.
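The worker–critique cycle above can be sketched as a simple loop. The `worker` and `critique` functions here are toy stand-ins for LLM agent calls, and the loop structure is an assumption for illustration, not Anemoi's implementation:

```python
# Illustrative worker-critique refinement loop. The callables stand in
# for LLM agents; this is a sketch, not the Anemoi implementation.
def solve(subtask: str, worker, critique, max_rounds: int = 3) -> str:
    result = worker(subtask)                  # r_i = worker output for s_i
    for _ in range(max_rounds):
        feedback = critique(result)           # f_i = critique(r_i)
        if feedback is None:                  # critique accepts the result
            return result
        # Otherwise the worker revises using the structured feedback.
        result = worker(f"{subtask}\nRevise using feedback: {feedback}")
    return result

# Toy stand-ins: the worker revises when asked; the critic accepts revisions.
def toy_worker(task: str) -> str:
    return "revised answer" if "Revise" in task else "draft answer"

def toy_critic(result: str):
    return None if result == "revised answer" else "needs revision"
```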
3. Empirical Performance and Benchmarking
Performance evaluation on the GAIA benchmark demonstrates notable accuracy improvements:
| System | Planner | Worker Agents | GAIA pass@3 Accuracy |
|---|---|---|---|
| Anemoi | GPT-4.1-mini | GPT-4o | 52.73% |
| OWL (baseline) | GPT-4.1-mini | GPT-4o | 43.63% |
With identical LLM settings (GPT-4.1-mini as planner, GPT-4o as workers), Anemoi surpasses the OWL baseline by +9.09 percentage points. Further breakdown by task level in the original dataset corroborates the generalizability of these gains. A salient result is that reduced dependency on a high-capacity planner does not impair system performance, due to orchestrated agent-to-agent communication and distributed oversight.
4. Scalability and Resource Efficiency
Anemoi’s architectural innovations yield enhanced scalability and cost-efficiency:
- Distributed Coordination: Worker agents engage in direct, context-aware exchanges, bypassing the need to repeatedly encode the entire context in prompts.
- Reduced Token Overhead: Only incremental updates and collaborative decisions are shared. This substantially lowers token count, reducing LLM inference costs, latency, and potential context overflow.
- Robustness to Planner Weakness: The system sustains accurate task execution even with small-parameter planners, as the burden of oversight and error correction is decentralized.
This design supports practical operation at scale in multi-step problem domains and high-connectivity multi-agent settings.
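The token-overhead argument can be made concrete with a toy calculation. The numbers below are illustrative assumptions (a crude one-token-per-word count, fixed message sizes), not measurements from the Anemoi evaluation; they only show why re-sending the full context each step grows quadratically while incremental thread updates grow linearly.

```python
# Toy illustration of token overhead, assuming ~1 token per word.
# Message sizes are invented; this is not data from the Anemoi paper.
def tokens(text: str) -> int:
    return len(text.split())

# Three execution steps, each producing a 150-word message.
history = ["step one result " * 50,
           "step two result " * 50,
           "step three result " * 50]

# Centralized re-prompting: the planner re-sends the full accumulated
# context to a worker at every step (quadratic growth).
centralized = sum(tokens(" ".join(history[:i + 1]))
                  for i in range(len(history)))

# Thread-based A2A: each agent reads only the new message per step
# (linear growth), since the shared thread holds prior context.
incremental = sum(tokens(msg) for msg in history)
```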
5. Implementation Framework and Integration
Anemoi provides a publicly available implementation at https://github.com/Coral-Protocol/Anemoi. Technical specifics:
- Coral Protocol’s A2A MCP Server: The system’s communication backbone, supporting robust, concurrent, and dynamic agent coordination.
- Agent Roles: Planner (initial orchestration), multiple specialized workers (execution), with agent roles and configurations matched to those in OWL for fair comparison.
- LLM Assignments: GPT-4.1-mini is assigned to the planner agent; GPT-4o instances (serving web, doc, code tasks) are deployed as worker agents.
- Benchmarking Configuration: All agents are configured as in the OWL baseline to isolate the effect of Anemoi’s architectural differences.
Researchers and practitioners can deploy, extend, or instrument the platform for further research in distributed AI and MAS coordination tasks.
6. Context and Significance within Multi-Agent Systems
Anemoi’s semi-centralized framework and A2A communication challenge the prevailing orthodoxy of single-planner-controlled MAS. The system’s empirical gains on GAIA indicate that direct, structured agent communication both enhances flexibility and increases robustness against planner bottlenecks and prompt context redundancies. This suggests suitability for deployments where LLM resource constraints or task complexity preclude highly centralized planning architectures. As MAS research continues to seek scalable, adaptive coordination, Anemoi demonstrates the viability of semi-centralized, thread-based coordination as a baseline for future distributed AI infrastructure.