Optimal Model Composition and Debate Protocols for Large Agent Swarms

Determine the composition of heterogeneous language models and the debate-protocol parameters that minimize sycophancy, disagreement collapse, and negative agreement when scaling the Kitchen Loop's Discussion Manager to larger agent swarms, and assess whether the sycophancy thresholds observed in small three-model debates generalize.

Background

The Discussion Manager implements structured multi-model debate (Gemini, Codex/GPT, Claude) with centralized moderation and safeguards against sycophancy. Empirical evaluation over a corpus of 23 discussions surfaced both strengths and weaknesses, but only at small scale (three debaters).

Scaling to larger, heterogeneous swarms introduces open design choices in model selection and protocol parameters. Whether the observed sycophancy thresholds and convergence properties carry over to larger collectives is not yet established.
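The design space above can be made concrete with a toy sketch. Everything here is a hypothetical illustration, not the paper's implementation: the `ProtocolParams` fields, the percentage-of-switchers formula standing in for the SS metric, and the collapse test are all assumptions about what a scaled Discussion Manager might tune and monitor.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class ProtocolParams:
    """Hypothetical knobs for an N-agent debate (paper evaluates N=3)."""
    n_agents: int = 5
    max_rounds: int = 4        # debate rounds before a forced vote
    quorum: float = 0.8        # fraction of agents needed to accept a position
    ss_threshold: float = 20.0 # sycophancy cutoff, echoing the SS < 20 figure

def sycophancy_score(prev: List[str], curr: List[str]) -> float:
    """Percent of agents that abandoned their position to join the round's
    majority -- a crude, invented stand-in for the paper's SS metric."""
    majority = max(set(curr), key=curr.count)
    switched = sum(1 for p, c in zip(prev, curr) if p != c and c == majority)
    return 100.0 * switched / len(curr)

def run_debate(params: ProtocolParams,
               agents: List[Callable[[List[str], int], str]]) -> dict:
    """Moderated debate loop: stop early on quorum, flag pathologies."""
    assert len(agents) == params.n_agents
    positions = [agent([], 0) for agent in agents]   # opening statements
    flags = []
    for rnd in range(1, params.max_rounds + 1):
        new_positions = [agent(positions, rnd) for agent in agents]
        ss = sycophancy_score(positions, new_positions)
        if ss >= params.ss_threshold:
            flags.append(f"round {rnd}: sycophancy (SS={ss:.0f})")
        if len(set(new_positions)) == 1 and len(set(positions)) > 1:
            flags.append(f"round {rnd}: disagreement collapse")
        positions = new_positions
        top = max(set(positions), key=positions.count)
        if positions.count(top) / len(positions) >= params.quorum:
            return {"verdict": top, "rounds": rnd, "flags": flags}
    return {"verdict": None, "rounds": params.max_rounds, "flags": flags}

# Stub debaters for experimentation (also invented for this sketch):
def stubborn(pos: str):
    """Agent that never changes its position."""
    return lambda others, rnd: pos

def conformist(start: str):
    """Agent that defects to the current majority -- a sycophancy source."""
    def f(others: List[str], rnd: int) -> str:
        return start if not others else max(set(others), key=others.count)
    return f
```

With three stubborn "A" agents and two conformists, the conformists defect to the majority in round one, tripping both the sycophancy and collapse flags while still yielding an "A" verdict; sweeping `n_agents` and agent mixes is one cheap way to probe how such thresholds behave as the swarm grows.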

References

OP4: Sycophancy at Scale. The Discussion Manager mitigates sycophancy in 3-model debates (Section 7), but optimal model composition and debate protocols for larger, heterogeneous agent swarms are unknown. Our corpus of 23 discussions is too small to establish whether the observed sycophancy-score threshold (SS < 20) generalizes (cf. Yao et al., 2025).

The Kitchen Loop: User-Spec-Driven Development for a Self-Evolving Codebase (2603.25697 - Roy, 26 Mar 2026) in Subsection "Open Problems" (Production Safety Record)