Multi-Agent Design: Optimizing Agents with Better Prompts and Topologies

This presentation explores the Multi-Agent System Search (MASS) framework, a novel approach to automatically designing high-performing multi-agent systems using large language models. MASS addresses the complexity of jointly optimizing agent prompts and interaction topologies through a structured three-stage methodology: local prompt optimization of individual agent blocks, influence-guided topology search, and global workflow-level refinement. Validated across reasoning, long-context understanding, and coding tasks, MASS achieves substantial performance gains of 10-13% over single-agent baselines and outperforms existing manual and automated design approaches, offering practical design principles for building more effective multi-agent systems.
Script
Designing effective multi-agent systems with large language models feels like searching for a needle in an exponentially growing haystack. The agents need perfect prompts, the right topology, and seamless collaboration, yet the design space explodes combinatorially with every new component you add.
The authors identify the core challenge: language models respond unpredictably to small prompt changes, while the space of possible agent workflows grows combinatorially. Traditional approaches rely either on costly manual tuning or on brute-force search that quickly becomes intractable.
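The researchers propose MASS, a structured solution that breaks this problem into three manageable stages: local prompt optimization of individual agent blocks, influence-guided topology search, and global workflow-level refinement.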
The first stage focuses on perfecting individual agent types before assembling them. The framework first optimizes the prompt of a baseline predictor, then systematically refines each specialized block, such as debate or reflection, and measures how much performance each block gains over that baseline.
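One way to picture the per-block measurement is as a simple ratio. The sketch below is a minimal illustration, assuming each block's contribution is scored as its validation score divided by the optimized baseline predictor's score. The function name, the block names, and the numbers are hypothetical and are not taken from the presentation.

```python
from typing import Dict


def incremental_influence(
    baseline_score: float, block_scores: Dict[str, float]
) -> Dict[str, float]:
    """Score each optimized block relative to the optimized baseline predictor.

    Assumption: influence is the ratio block_score / baseline_score, so a
    value above 1.0 means the block helps and a value below 1.0 means it hurts.
    """
    if baseline_score <= 0:
        raise ValueError("baseline_score must be positive")
    return {name: score / baseline_score for name, score in block_scores.items()}


# Hypothetical validation accuracies, for illustration only.
block_scores = {"debate": 0.78, "reflect": 0.66, "aggregate": 0.72}
influence = incremental_influence(0.60, block_scores)

# Rank blocks from most to least helpful.
ranking = sorted(influence, key=influence.get, reverse=True)
```

With these made-up numbers, debate ranks first and reflection last, which is the kind of ordering the next stage can use to prioritize its search.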
With optimized components in hand, the second stage searches for the best workflow structure. Rather than exhaustively testing every combination, MASS uses the influence scores measured in the first stage to bias sampling toward promising architectures, efficiently navigating billions of potential configurations.
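The biased sampling can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: influence scores are turned into selection probabilities with a temperature softmax, each block is then switched on independently with its probability, and samples that exceed a block budget are redrawn. The names `selection_probabilities`, `sample_topology`, and `max_blocks` are hypothetical.

```python
import math
import random
from typing import Dict, List


def selection_probabilities(
    influence: Dict[str, float], temperature: float = 1.0
) -> Dict[str, float]:
    """Softmax over influence scores; lower temperature sharpens the bias."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    top = max(influence.values())  # subtract the max for numerical stability
    weights = {n: math.exp((v - top) / temperature) for n, v in influence.items()}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


def sample_topology(
    probs: Dict[str, float],
    rng: random.Random,
    max_blocks: int,
    max_tries: int = 1000,
) -> List[str]:
    """Activate each block with its probability; redraw if over budget.

    An empty list stands for the plain baseline predictor with no extra blocks.
    """
    for _ in range(max_tries):
        chosen = [name for name, p in probs.items() if rng.random() <= p]
        if len(chosen) <= max_blocks:
            return chosen
    return []


# Hypothetical influence scores, for illustration only.
probs = selection_probabilities(
    {"debate": 1.3, "reflect": 1.1, "aggregate": 1.2}, temperature=0.1
)
topology = sample_topology(probs, random.Random(0), max_blocks=2)
```

Blocks with higher influence are drawn more often, so most of the evaluation budget goes to topologies built from components that already proved useful on their own.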
Building on the selected topology, the final stage performs holistic optimization across the complete workflow. This global tuning allows agent prompts to adapt specifically to their collaborative context, capturing interaction nuances that local optimization might miss.
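Put together, the three stages chain as in the skeleton below. It is a sketch only: the prompt optimizers, the topology proposer, and the evaluator are passed in as hypothetical callables, because the script does not say how any of them are implemented.

```python
from typing import Callable, Dict, List, Tuple

Prompts = Dict[str, str]  # block name -> prompt text
Topology = List[str]      # block names used in the workflow


def staged_search(
    block_names: List[str],
    optimize_block: Callable[[str], Tuple[str, float]],
    propose_topology: Callable[[Dict[str, float]], Topology],
    evaluate: Callable[[Topology, Prompts], float],
    optimize_workflow: Callable[[Topology, Prompts], Prompts],
    n_candidates: int = 10,
) -> Tuple[Topology, Prompts, float]:
    """Run local prompt tuning, topology search, then global prompt tuning."""
    # Stage 1: tune each block in isolation; keep its prompt and its score.
    prompts: Prompts = {}
    scores: Dict[str, float] = {}
    for name in block_names:
        prompts[name], scores[name] = optimize_block(name)

    # Stage 2: score sampled topologies using the stage-1 prompts; keep the best.
    candidates = [propose_topology(scores) for _ in range(n_candidates)]
    best_topology = max(candidates, key=lambda t: evaluate(t, prompts))

    # Stage 3: re-tune the prompts jointly on the selected topology.
    tuned = optimize_workflow(best_topology, prompts)
    return best_topology, tuned, evaluate(best_topology, tuned)
```

Passing the stages in as functions keeps the skeleton independent of any particular prompt optimizer or benchmark, which matches the stage-wise framing of the talk.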
Turning to empirical validation, the authors tested MASS across reasoning, long-context understanding, and coding tasks. Results show substantial improvements over single-agent approaches and manual designs, with gains holding consistently across different language model backbones and problem domains.
Digging into why this works, ablation studies reveal that each stage contributes essential value. Starting with high-quality components reduces error propagation, influence-based search avoids wasted computation, and global tuning captures emergent collaborative patterns that only appear in the final workflow.
From these findings, the researchers distill actionable guidance for building multi-agent systems. The framework demonstrates that systematic decomposition of the optimization problem, combined with empirical prioritization of design dimensions, vastly outperforms both manual engineering and naive automated search.
MASS shows us that the future of multi-agent design lies not in searching harder, but in searching smarter through structured, stage-wise refinement. Visit EmergentMind.com to explore how these principles can transform your own multi-agent systems.