- LLM-guided evolutionary search outperforms internal deliberation on stability and productivity in the gridworld and public goods settings, but not in bilateral trading.
- Evolved constitutions embed explicit costly-punishment rules, which strengthen enforcement but are brittle to incentive shifts.
- Internal deliberation adapts better: deliberating agents update their norms when incentives change, which makes their governance more robust in dynamic environments.
Deliberation Versus Evolution for Multi-Agent Constitutional Design
Problem Statement and Motivation
Multi-agent AI ecosystems increasingly require explicit behavioral constitutions to ensure societal stability and alignment, but the best process for discovering these constitutions remains unresolved. The central question is whether constitutions should be generated internally, through agent deliberation and self-governance, or externally, via optimization methods such as LLM-guided evolution. The contrast reflects a structural trade-off between responsiveness and enforcement capacity. Precedents such as Constitutional AI address single-agent settings with fixed, human-authored rules, whereas multi-agent settings pose distinct challenges, including strategic interaction, endogenous norm emergence, and the absence of a central legislator.
Experimental Design and Methodology
The work conducts a controlled comparison of internal deliberation (agents collectively proposing and voting on rules) and external evolutionary optimization (offline LLM-guided search for constitutions) across three social environments:
- Gridworld Coordination: Agents gather resources for team projects with options for conflict and theft.
- Iterated Public Goods Game (PGG): Agents allocate tokens between private wealth and a public pool, with costly punishment available.
- Bilateral Trading Game: Agents trade private endowments under asymmetric information.
All agents are instantiated as GPT-OSS-120B instances. Each simulation comprises six agents split into two teams, and an overseer eliminates the lowest performer every ten turns. The evaluation metric is the Stability Score S = max(0, 0.5P + 0.3V − 0.2C), which combines productivity (P), survival (V), and social conflict (C).
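The Stability Score can be written as a one-line function. This is a minimal sketch of the formula as stated above; the function name, the argument names, and the assumption that each component lies in [0, 1] are illustrative and not taken from the paper.

```python
def stability_score(productivity: float, survival: float, conflict: float) -> float:
    """Stability Score S = max(0, 0.5*P + 0.3*V - 0.2*C).

    Assumption: all three inputs are normalized to [0, 1].
    """
    raw = 0.5 * productivity + 0.3 * survival - 0.2 * conflict
    # The score is clipped at zero, so heavy conflict cannot push it negative.
    return max(0.0, raw)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the formula.
    print(stability_score(productivity=1.0, survival=1.0, conflict=0.0))
    print(stability_score(productivity=0.0, survival=0.0, conflict=1.0))
```

Because of the weights, the score tops out at 0.8 when productivity and survival are both 1 and conflict is 0.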
Three experimental arms are assessed: no constitution (baseline), internal deliberation, and an evolved external constitution. For the PGG, an ablation is conducted by varying the pool multiplier m (which adjusts the base incentive for cooperation) to test generalization of constitutions optimized at m=1.5.
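To make the role of the pool multiplier concrete, the sketch below uses the textbook linear public goods payoff: each agent keeps what it does not contribute and receives an equal share of the multiplied pool. This payoff form, the endowment of 10 tokens, and the function name are assumptions for illustration; the paper's exact payoff function is not given here.

```python
from typing import List


def pgg_payoffs(contributions: List[float],
                endowment: float = 10.0,
                multiplier: float = 1.5) -> List[float]:
    """One round of a linear public goods game (assumed textbook form).

    Each agent keeps (endowment - contribution) and receives an equal
    share of the pool after it is scaled by `multiplier`.
    """
    n = len(contributions)
    share = multiplier * sum(contributions) / n
    return [endowment - c + share for c in contributions]


if __name__ == "__main__":
    # Six agents, matching the simulation size described above.
    print(pgg_payoffs([10.0] * 6))            # everyone contributes fully
    print(pgg_payoffs([0.0] + [10.0] * 5))    # one free-rider
```

With these illustrative numbers, full contribution pays each agent 15 tokens, while a lone free-rider earns 22.5 against 12.5 for the contributors. That gap is the incentive problem a constitution has to manage.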
Results and Empirical Findings
Superiority of External Evolution in Collective Action
External evolutionary search consistently outperforms agent deliberation and the baseline in both the gridworld and public goods settings. In gridworld, evolution yields S=0.458±0.017 versus deliberation's 0.319±0.091 (mean ± std), with highly significant pairwise differences (p<0.01). Productivity is the dominant contributor to this gap (evolved P=0.916, deliberation P=0.701), and evolved constitutions completely suppress conflict (C=0.000), an effect not replicated under deliberation.
In the PGG, evolution achieves S=0.472±0.004, again outperforming deliberation with high significance. Evolution-derived constitutions reliably encode costly peer punishment, a canonical enforcement mechanism for sustaining cooperation.
Robustness and Failure Modes
Ablation studies on the public goods multiplier reveal the brittleness of evolved constitutions: when m is reduced from 1.5 (where cooperation is optimal) to a value at which cooperation destroys value, the evolved constitution continues to enforce maximal contribution and punishment, leading to negative returns and making it the worst-performing method. In contrast, deliberating agents adapt their rules to the changed environment, outperforming both the baseline and the evolved constitution at the reduced multiplier.
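The brittleness result has a simple reading under the textbook linear public goods payoff. That payoff form is an assumption here, since the paper's exact payoff function is not stated in this summary. With N agents, endowment e, contributions c_i, and pool multiplier m, the individual payoff, the group welfare W, and the marginal private return to contributing are:

```latex
\begin{aligned}
\pi_i &= e - c_i + \frac{m}{N}\sum_{j=1}^{N} c_j, \\
W &= \sum_{i=1}^{N} \pi_i = N e + (m - 1)\sum_{j=1}^{N} c_j, \\
\frac{\partial \pi_i}{\partial c_i} &= \frac{m}{N} - 1 .
\end{aligned}
```

For 1 < m < N, contributing raises W but lowers the contributor's own payoff, which is the usual social dilemma. For m < 1, every contributed token lowers W by 1 − m, so a rule that enforces maximal contribution reduces welfare, and any punishment cost comes on top of that loss.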
Null Results in Bilateral Trading
Neither deliberation nor evolution yields significant improvements over the baseline in trading. Constitutions discovered in this context are advisory rather than prescriptive, reflecting the lack of dominant rules in bargaining games with private information. Performance is limited mainly by the agents' case-specific inference, not by the regulatory structure.
Mechanistic Analysis: The Punishment Gap
A critical empirical finding is that, across 30 deliberation runs, agents never proposed costly punishment. Deliberation consistently yields governance frameworks that prescribe thresholds, redistribution, or aspirational norms but avoid direct sanctioning provisions. In contrast, evolved constitutions contain explicit behavioral directives: precise, executable rules (e.g., "punish all free-riders with one token") that produce the deterrence predicted by game theory. This structural asymmetry arises because self-governing collectives are reluctant to institutionalize coercive sanctions, while external search, unconstrained by agent self-interest, converges immediately to optimal enforcement devices.
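The deterrence effect of a rule like "punish all free-riders with one token" can be sketched as a post-round adjustment. Everything numeric below is hypothetical: the 1-token cost to the punisher, the 3-token fine on the target, the contribution threshold, and the pre-punishment payoffs, which come from an illustrative linear public goods round with an endowment of 10 and m=1.5. The function name is also an assumption, not the paper's implementation.

```python
from typing import List


def apply_peer_punishment(payoffs: List[float],
                          contributions: List[float],
                          required: float,
                          cost: float = 1.0,
                          fine: float = 3.0) -> List[float]:
    """Every compliant agent punishes every free-rider once.

    Assumed mechanics: a punisher pays `cost` per free-rider, and a
    free-rider loses `fine` per punisher.
    """
    free_riders = [i for i, c in enumerate(contributions) if c < required]
    enforcers = [i for i, c in enumerate(contributions) if c >= required]
    adjusted = list(payoffs)
    for i in enforcers:
        adjusted[i] -= cost * len(free_riders)
    for j in free_riders:
        adjusted[j] -= fine * len(enforcers)
    return adjusted


if __name__ == "__main__":
    # Hypothetical round: one free-rider among six agents.
    contributions = [0.0, 10.0, 10.0, 10.0, 10.0, 10.0]
    payoffs = [22.5, 12.5, 12.5, 12.5, 12.5, 12.5]
    print(apply_peer_punishment(payoffs, contributions, required=10.0))
```

Under these assumed parameters the free-rider ends the round with 7.5 tokens against 11.5 for each punisher, so defecting no longer pays. The punishers also end below their pre-punishment 12.5, which shows why enforcement is costly to the agents who carry it out.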
Theoretical Implications and Future Directions
The comparison highlights a fundamental trade-off: external evolutionary search excels in static, well-characterized social dilemmas by reliably discovering maximal enforcement structures, but is brittle to changes in incentive landscapes. Internal deliberation, despite its inability to discover certain classes of effective rules (notably costly punishment), confers superior structural adaptability. Deliberation-derived norms are sensitive to environmental feedback and shift appropriately as underlying incentives change. Thus, external constitutions "win on peaks" but lack robustness, whereas internal mechanisms optimize for responsiveness rather than maximum possible group payoff. In bilateral negotiation settings lacking dominant cooperative solutions, constitutional interventions are largely ineffectual.
The authors suggest that hybrid approaches warrant further investigation—specifically, initializing with an externally evolved scaffold and permitting ongoing deliberative amendment to preserve enforcement while accommodating environmental drift.
Conclusion
This work systematically compares internal agent deliberation and external LLM-guided evolutionary search for multi-agent constitutional design. External evolution reliably outperforms deliberation in collective-action dilemmas via the discovery of enforcement mechanisms such as costly punishment, but these externally imposed rules can become deleterious under incentive regime shifts. Internal deliberation, while unable to autonomously legislate direct enforcement, yields constitutions that adapt to current circumstances. For environments where social dilemma structure is dominant and stable, external optimization is preferable; for environments requiring ongoing adaptation, deliberative self-governance is more robust. No constitutional mechanism confers benefit in bilateral bargaining without exogenous enforcement leverage. The results motivate further synthesis: hybrid architectures that leverage the enforcement strength of evolution and the structural adaptability of deliberative amendment.