
Multi-Agent Collaboration Network (MacNet)

Updated 13 November 2025
  • MacNet is a graph-based paradigm that coordinates reasoning among LLM agents via directed acyclic graph structures.
  • It orchestrates agent interactions in a topologically ordered, resource-efficient manner, ensuring scalable and refined solution propagation.
  • It exhibits a collaborative scaling law in which collective performance emerges at roughly 16–32 agents, and adaptive variants can learn the collaboration graph dynamically.

A Multi-Agent Collaboration Network (MacNet) is a graph-based paradigm for orchestrating distributed, interactive reasoning among autonomous agents—typically instantiated as LLMs—in a manner structurally and functionally analogous to neural networks. MacNets leverage explicit graph topologies (frequently directed acyclic graphs, DAGs) to coordinate communication, reflection, and refinement among agents, supporting scalable, emergent collective intelligence with predictable scaling behavior. Designed to efficiently propagate concise, refined solutions through topologically ordered interactions, this architecture enables both (i) resource-efficient collaboration among hundreds to thousands of agents and (ii) the systematic emergence of qualitatively new reasoning capabilities well before the parameter thresholds typical in monolithic neural scaling.

1. Formal Definition and Motivation

A MacNet is classically formalized as a DAG $G = (V, E)$, with nodes $V = \{v_i\}$ and directed edges $E \subset V \times V$. Each node $v_i$ is assigned an assistant agent $a_i = \rho(v_i)$ and each edge $(v_i \rightarrow v_j)$ is assigned an instructor agent $a_{ij} = \rho(e)$, where $\rho(\cdot)$ denotes an agentization procedure wrapping an LLM backbone (e.g., GPT-3.5-turbo, GPT-4) with a role-specific prompt, optional tool-use bindings, and a short-term memory buffer (Qian et al., 11 Jun 2024).
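A minimal Python sketch of this construction is given below. The Agent class is a hypothetical stand-in for $\rho(\cdot)$; its methods return canned strings where a real system would call an LLM:

class Agent:
    """Hypothetical stand-in for rho(.): an LLM backbone wrapped with a role prompt.
    Methods return canned strings; a real system would invoke the model."""
    def __init__(self, role_prompt):
        self.role_prompt = role_prompt

    def generate(self, context):        # assistant drafts a solution T
        return f"[{self.role_prompt}] draft for: {context}"

    def instruct(self, draft):          # instructor critiques, returning feedback F
        return f"[{self.role_prompt}] feedback on: {draft}"

    def refine(self, draft, feedback):  # assistant revises T into T'
        return f"{draft} | revised per: {feedback}"

def agentize(edge_list):
    """Assign an assistant a_i = rho(v_i) to each node and an instructor a_ij = rho(e) to each edge."""
    nodes, edges = {}, {}
    for u, v in edge_list:
        nodes.setdefault(u, Agent(f"assistant@{u}"))
        nodes.setdefault(v, Agent(f"assistant@{v}"))
        edges[(u, v)] = Agent(f"instructor@{u}->{v}")
    return nodes, edges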

This construction is directly motivated by the neural scaling law, which reveals that system-level capabilities in deep learning appear abruptly after exceeding critical parameter, data, or compute thresholds. MacNet poses an analogous question: can the repeated, structured addition of collaborative LLM agents produce “collaborative emergence” in task performance, akin to emergent phenomena under neural scaling but with far fewer agent units (Hu et al., 22 Oct 2024)?

2. Topological Orchestration and Interaction Protocol

MacNet execution adheres to a strict topological order determined by the DAG; namely, for every directed edge $(v_i \rightarrow v_j)$:

  • Assistant $a_i$ generates or refines a solution $T$.
  • Instructor $a_{ij}$ critiques it and offers suggestions $F$.
  • Assistant $a_i$ produces a refined version $T'$.
  • Instructor $a_{ij}$ issues a prompt to assistant $a_j$, who generates response $V$.

The latest refined artifact $S[v_j]$ is the only solution propagated downstream; previous dialogue history is pruned, alleviating LLM context-window bottlenecks and ensuring scalability to large agent populations.

Each edge’s local memory buffers multi-turn instruction–response exchanges, typically limited to three turns before memory clearance. Convergent nodes (with multiple in-edges) resolve $k$ upstream solutions by soliciting the assistant at that node to synthesize and critique them, implementing a form of non-linear aggregation.

The general MacNet orchestration pseudocode is:

Input: DAG G=(V,E), initial prompt P0 at source nodes
1. topo_order = TopologicalSort(agents = {a_i} ∪ {a_{ij}})
2. solutions S[v] = ∅ for all v

3. for X in topo_order:
    if X is (a_i, a_{ij}):
        T = a_i.generate(S[v_i] or P0)
        F = a_{ij}.instruct(T)
        T' = a_i.refine(T, F)
        S[e] = T'

    if X is (a_{ij}, a_j):
        U = a_{ij}.generate(S[e])
        V = a_j.respond(U)
        S[v_j] = V

4. Final solution(s) reside at sink or convergent nodes

Scaling up, interactions are batched, and messages are pruned aggressively to maintain $O(1)$ per-agent context.
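A compressed, runnable rendering of this pass is sketched below, reusing the Agent stubs and agentize helper from the sketch in Section 1. As a simplifying assumption, the four-step edge protocol is folded into a critique-and-refine loop at the receiving node, which also handles convergent aggregation:

from graphlib import TopologicalSorter  # stdlib, Python 3.9+

def run_macnet(edge_list, nodes, edges, prompt):
    """One topologically ordered feed-forward pass; only the latest refined
    artifact per node propagates, so per-agent context stays O(1)."""
    preds = {}
    for u, v in edge_list:
        preds.setdefault(u, set())
        preds.setdefault(v, set()).add(u)
    S = {}
    for v in TopologicalSorter(preds).static_order():
        # Source nodes start from the task prompt; convergent nodes see all k upstream solutions.
        context = prompt if not preds[v] else "\n".join(S[u] for u in sorted(preds[v]))
        draft = nodes[v].generate(context)
        for u in sorted(preds[v]):
            feedback = edges[(u, v)].instruct(draft)  # instructor on edge (u, v) critiques
            draft = nodes[v].refine(draft, feedback)  # assistant refines; dialogue is then pruned
        S[v] = draft
    sinks = [v for v in preds if not any(v in p for p in preds.values())]
    return {v: S[v] for v in sinks}

# Example on a small diamond-shaped DAG:
E = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t")]
nodes, edges = agentize(E)
print(run_macnet(E, nodes, edges, "Implement a queue"))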

3. Collaborative Scaling Law and Emergence

Empirical studies establish a “collaborative scaling law” for MacNet: as the agent count $n$ increases, normalized solution quality $Q(n)$ exhibits sigmoidal (logistic) growth,

$$Q(n) \approx \frac{\alpha}{1 + e^{-\beta(n - \gamma)}} + \delta,$$

where $\alpha$ (amplitude), $\beta$ (growth rate), $\gamma$ (inflection point), and $\delta$ (shift) are fitted parameters; for instance, $\gamma \approx 18.2$ was reported for a representative mesh topology (Qian et al., 11 Jun 2024).

Crucially, collaborative emergence, i.e., the appearance of qualitatively superior collective intelligence, arises at $n \approx 16$–$32$ agents (orders of magnitude smaller than corresponding neural scale thresholds, typically $10^{18+}$ parameters (Kaplan et al., 2020)).

Beyond $n \sim 32$, further agent addition can induce slight quality degradation (2–6%) due to meta-context drift, paralleling oversaturation effects in logistic curves.
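In practice, the four parameters can be recovered from quality measurements at increasing agent counts by standard curve fitting. The sketch below uses purely synthetic data for illustration (not the paper's measurements):

import numpy as np
from scipy.optimize import curve_fit

def Q(n, alpha, beta, gamma, delta):
    """Logistic collaborative scaling law: Q(n) = alpha / (1 + exp(-beta (n - gamma))) + delta."""
    return alpha / (1.0 + np.exp(-beta * (n - gamma))) + delta

# Synthetic quality measurements for illustration only.
n = np.array([1, 2, 4, 8, 16, 24, 32, 50], dtype=float)
q = np.array([0.52, 0.53, 0.55, 0.58, 0.62, 0.64, 0.65, 0.64])

(alpha, beta, gamma, delta), _ = curve_fit(Q, n, q, p0=[0.15, 0.3, 18.0, 0.5], maxfev=10000)
print(f"fitted inflection point gamma ~ {gamma:.1f} agents")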

4. Graph Topologies and the Small-World Effect

Comparative experiments across topological families demonstrate that irregular (random, Erdős–Rényi) graphs, characterized by small-world properties (reduced path length and clustering), outperform both highly structured meshes and sparse chains. For example, “MacNet-Random” yielded a 1–3% absolute quality gain over MacNet-Mesh (Qian et al., 11 Jun 2024).

Reverse-topology tests, such as flipping star-shaped graphs so that information converges prematurely, degrade performance by 4–6%. This underscores that rapid divergence of information (distributing parallel reasoning across specialists) is preferable to early convergence.
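One simple way to instantiate such an irregular topology, sketched below as an assumption rather than the paper's exact construction, is to sample an Erdős–Rényi graph and orient each edge from lower to higher node index, which guarantees acyclicity:

import networkx as nx

def random_macnet_dag(num_agents, p, seed=0):
    """Sample G(n, p) and orient edges low -> high index, yielding a random DAG."""
    g = nx.gnp_random_graph(num_agents, p, seed=seed)  # undirected Erdos-Renyi graph
    dag = nx.DiGraph()
    dag.add_nodes_from(g.nodes())
    dag.add_edges_from((min(u, v), max(u, v)) for u, v in g.edges())
    assert nx.is_directed_acyclic_graph(dag)
    return dag

dag = random_macnet_dag(50, 0.1, seed=42)
order = list(nx.topological_sort(dag))  # execution order for the agents
print(dag.number_of_edges(), "edges; sources:", [v for v in dag if dag.in_degree(v) == 0][:5])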

5. Applications, Empirical Performance, and Comparative Evaluation

MacNet and its extensions have been empirically validated on a range of reasoning and generation tasks:

  • MMLU (multiple-choice reasoning): accuracy
  • HumanEval (code generation): pass@1
  • SRDD (repository-level software development): quality score in $[0, 1]$
  • CommonGen-Hard (commonsense generation): composite score in $[0, 1]$

A direct comparison (with GPT-3.5-turbo, 50-node MacNet) yields the following representative mean scores:

Method            Composite Score
CoT               0.576
AutoGPT           0.566
GPTSwarm          0.516
AgentVerse        0.581
MacNet-Chain      0.608
MacNet-Star       0.627
MacNet-Tree       0.602
MacNet-Mesh       0.632
MacNet-Layered    0.563
MacNet-Random     0.652

MacNet-Random provided a 7% absolute improvement over AgentVerse (0.652 vs. 0.581) and a 13% relative margin over single-agent CoT (Qian et al., 11 Jun 2024).

6. Adaptive and Self-Evolving MacNets

Standard MacNets operate with a fixed, human-devised graph topology. Adaptive MacNets, exemplified by “Unrolled Graph Learning for Multi-Agent Collaboration” (Zhang et al., 2022), dynamically infer the collaboration adjacency matrix $A \in \mathbb{R}^{N \times N}$ via attention-weighted, per-coordinate similarity:

$$D_i = \left[ (\theta_{i,m} - \theta_{j,m})^2 \right]_{m,j}$$

$$\min_{a_i} \tfrac{1}{2} \| Q D_i a_i \|_2^2 \quad \text{s.t. } \|a_i\|_1 = 1,\; a_{ii} = 0,\; a_{ij} \ge 0$$

Graph weights are updated by unrolled (truncated) proximal gradient steps, implemented as a feed-forward learned neural module. Agents alternate between updating their outgoing edges (collaborator selection) and fusing models from neighbors via convex combination.
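A plain NumPy sketch of one such update follows, assuming the unrolled steps reduce to projected gradient descent with a fixed step size (the paper learns these steps; the shapes of $Q$ and $D_i$ below are illustrative assumptions):

import numpy as np

def project_simplex(v):
    """Euclidean projection onto the probability simplex (standard sort-based method)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    rho = np.nonzero(u - css / np.arange(1, len(v) + 1) > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1.0), 0.0)

def collaborator_weights(D_i, Q, i, steps=10, lr=0.1):
    """K unrolled projected-gradient steps approximating agent i's edge weights a_i,
    minimizing 0.5 * ||Q D_i a||^2 subject to ||a||_1 = 1, a >= 0, a_ii = 0."""
    N = D_i.shape[1]
    a = np.full(N, 1.0 / (N - 1))
    a[i] = 0.0
    M = Q @ D_i
    others = np.arange(N) != i
    for _ in range(steps):
        a -= lr * (M.T @ (M @ a))              # gradient of the smooth objective
        a[others] = project_simplex(a[others]) # proximal step: enforce the constraints
        a[i] = 0.0
    return a

# Illustrative shapes: 5 agents, 8 model coordinates, Q as identity attention.
theta = np.random.default_rng(0).normal(size=(5, 8))
i = 0
D_i = (theta[i][:, None] - theta.T) ** 2  # D_i[m, j] = (theta_{i,m} - theta_{j,m})^2
print(collaborator_weights(D_i, np.eye(8), i))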

Self-evolving MacNets, as in EvoMAC (Hu et al., 22 Oct 2024), introduce closed-loop test-time optimization: after each feed-forward pass producing candidate artifacts (e.g., code), an independent testing network generates unit tests, and textual “gradient” agents analyze logs for error localization and workflow/agent-prompt rewrites. This realizes “textual backpropagation,” allowing dynamic agent addition/removal and prompt rewriting to minimize observed failures.
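The control flow of this loop can be sketched as follows; every component below is a placeholder standing in for an LLM-backed agent (sub)network, not EvoMAC's actual interfaces:

# Placeholder components; each would be an LLM-backed agent network in practice.
forward      = lambda task, prompts: f"code for '{task}' from {len(prompts)} agents"
make_tests   = lambda task: ["test_basic"]
run_suite    = lambda code, tests: []       # returns failure logs; empty list = all tests pass
textual_grad = lambda failures: {"fixer": f"Address: {failures}"}

def evomac_step(task, prompts, max_rounds=3):
    """Schematic EvoMAC-style closed loop: forward pass, test, textual backpropagation."""
    code = None
    for _ in range(max_rounds):
        code = forward(task, prompts)                 # feed-forward pass of the coding network
        failures = run_suite(code, make_tests(task))  # independent testing network + execution
        if not failures:                              # objective satisfied: no failing tests
            return code
        update = textual_grad(failures)               # textual 'gradient' localizes errors
        prompts = {**prompts, **update}               # rewrite prompts / reshape the workflow
    return code

print(evomac_step("sort a list", {"coder": "You write Python."}))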

Empirically, EvoMAC achieves substantial improvements over prior methods on both function-level and software-level benchmarks (e.g., rSDE-Bench, HumanEval). On rSDE-Bench, EvoMAC yielded up to +34.78 percentage points improvement over the strongest single-agent baseline.

7. Practical Considerations, Limitations, and Future Directions

MacNets exhibit key practical advantages: context-efficient memory usage (only the latest solution per edge), fine-grained orchestration via topological sorting, and empirically predictable scaling behavior. However, several constraints and phenomena must be acknowledged:

  • Reverse Degradation: Exceeding optimal network size (beyond $n \sim 32$ in typical settings) can induce performance drops due to context drift or excessive splitting of meta-context.
  • Workflow Design: Topology must match task—irregular, small-world graphs often outperform regular structures; premature convergence is detrimental.
  • Adaptivity: Learned adaptive graphs and “textual backpropagation” provide autonomy, but global convergence and optimality are not guaranteed.
  • Application Limits: Self-evolving settings rely heavily on the quality of auxiliary components (e.g., unit-test generators), and system performance may be bottlenecked by LLM latency and cost at scale.
  • Resource Trade-offs: While per-agent context scales $O(1)$, total system throughput and communication may still tax distributed LLM backends.

A plausible implication is that MacNet frameworks offer a scalable path to orchestrating collective reasoning among LLMs, enabling resource-efficient emergence of complex capabilities. Future research may involve reward-model-driven updates, meta-prompt learning for gradient/update agents, and extension to domains beyond code (e.g., document synthesis with automated validators).
