Systematic identification of error-prone components in MAS execution graphs

Develop automated, systematic methods for identifying error-prone nodes and edges in the directed execution graphs of multi-agent systems (MAS) built from large language model (LLM) agents and external tools. In these graphs, nodes represent agents or tools and edges represent inter-agent message dependencies. Such methods would enable robust critical-path diagnosis and targeted augmentation across diverse agentic workflows.

Background

The source paper models agentic workflows as directed execution graphs in which nodes are LLM agents or tools and edges are the messages that carry dependencies between them. It observes that multi-agent systems often suffer from node-, edge-, and path-level defects that limit accuracy and increase cost, and that diagnosing these defects across varied applications is challenging. Verifying every intermediate message is expensive, and current frameworks mostly rely on public APIs, which limits the options for low-level optimization.
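To make the graph model concrete, the following is a minimal sketch of such an execution graph in Python. It is an illustration under stated assumptions, not the paper's implementation: the class name `ExecutionGraph`, its methods, and the node names in the usage note are all hypothetical. The `downstream` helper shows why a single defective node matters: every node reachable from it may consume a corrupted message.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class ExecutionGraph:
    """Hypothetical container: nodes are agents or tools, edges are message dependencies."""

    kinds: dict[str, str] = field(default_factory=dict)       # node -> "agent" or "tool"
    preds: dict[str, set[str]] = field(default_factory=dict)  # node -> upstream nodes

    def add_node(self, name: str, kind: str) -> None:
        if kind not in ("agent", "tool"):
            raise ValueError(f"unknown node kind: {kind!r}")
        self.kinds[name] = kind
        self.preds.setdefault(name, set())

    def add_edge(self, src: str, dst: str) -> None:
        # An edge src -> dst means dst consumes a message produced by src.
        if src not in self.kinds or dst not in self.kinds:
            raise KeyError("both endpoints must be added as nodes first")
        self.preds[dst].add(src)

    def order(self) -> list[str]:
        # One valid execution order; raises graphlib.CycleError on a cyclic graph.
        return list(TopologicalSorter(self.preds).static_order())

    def downstream(self, node: str) -> set[str]:
        # All nodes that directly or transitively depend on `node`.
        reached: set[str] = set()
        frontier = [node]
        while frontier:
            current = frontier.pop()
            for candidate, upstream in self.preds.items():
                if current in upstream and candidate not in reached:
                    reached.add(candidate)
                    frontier.append(candidate)
        return reached
```

As a usage example, a workflow with a `planner` agent feeding a `search` tool and a `writer` agent would add three nodes and the edges `planner -> search`, `planner -> writer`, and `search -> writer`; `downstream("planner")` then returns both other nodes, while `downstream("writer")` is empty.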

A strawman approach is to upgrade every agent to a stronger model, but this is not cost-effective. The authors argue that selectively improving only the most critical agents requires a principled way to identify which nodes and edges are error-prone. They propose a confidence-guided probing mechanism as an initial step, but explicitly note that the broader challenge of systematic identification remains open-ended.
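The summary above does not describe how the confidence-guided probing mechanism works, so the sketch below is not the authors' method. It only illustrates the general idea of selective upgrading under assumptions made up for this example: each node has a list of confidence scores in [0, 1] collected from earlier runs, and the function names, the `threshold`, and the `budget` parameter are hypothetical.

```python
from statistics import mean


def rank_by_confidence(scores: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Return (node, mean confidence) pairs, least confident first.

    Nodes with no recorded scores are skipped rather than guessed at.
    """
    averaged = {node: mean(values) for node, values in scores.items() if values}
    return sorted(averaged.items(), key=lambda item: item[1])


def select_for_upgrade(
    scores: dict[str, list[float]],
    threshold: float = 0.7,
    budget: int = 1,
) -> list[str]:
    """Pick at most `budget` nodes whose mean confidence is below `threshold`."""
    ranked = rank_by_confidence(scores)
    return [node for node, confidence in ranked if confidence < threshold][:budget]
```

Given hypothetical scores where a `retriever` node averages 0.55, a `writer` node 0.675, and a `planner` node 0.925, a budget of one selects only `retriever`, which contrasts with the strawman of upgrading all three. Low confidence is only a proxy for being error-prone; whether it is a reliable one is part of what the open problem asks.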

References

However, this is nontrivial, as it requires systematically identifying error-prone nodes and edges in the execution graph—a challenge that remains open-ended.

— Single-agent or Multi-agent Systems? Why Not Both? (2505.18286 - Gao et al., 23 May 2025) in Section 4.1 (Augmenting MAS Critical Path)
