Propagation of Template Collapse in Multi‑Agent Reinforcement Learning

Determine how template collapse propagates in multi-agent reinforcement learning settings, and identify the mechanisms and dynamics by which it emerges and spreads among interacting agents. Template collapse is the degradation in which reasoning becomes input-agnostic while within-input diversity remains high.

Background

The paper defines template collapse as a failure mode in which LLM agents produce reasoning that appears diverse across samples for a single input but is largely input-agnostic across different inputs. This corresponds to low mutual information between inputs and reasoning while the conditional entropy of reasoning given the input remains high. The paper provides diagnostics based on mutual information proxies and explains the mechanism via a signal-to-noise ratio analysis in single-agent settings.
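To make the two quantities in this definition concrete, the following is a minimal sketch, not the paper's diagnostic. It assumes that each reasoning trace has already been mapped to a discrete label (for example by clustering), which is a simplification introduced here for illustration. The function name collapse_diagnostics and the toy data are hypothetical. The sketch uses plug-in entropy estimates and the identity I(X;Z) = H(Z) - H(Z|X), with H(Z|X) = H(X,Z) - H(X).

```python
import math
from collections import Counter


def entropy(counts):
    """Plug-in Shannon entropy (in bits) of an empirical count table."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)


def collapse_diagnostics(pairs):
    """Return (mutual_information, conditional_entropy) in bits.

    pairs: list of (input_id, reasoning_label) samples. The discrete
    reasoning_label is an assumption of this sketch, not the paper's proxy.
    """
    h_x = entropy(Counter(x for x, _ in pairs))    # H(X)
    h_z = entropy(Counter(z for _, z in pairs))    # H(Z)
    h_xz = entropy(Counter(pairs))                 # H(X, Z)
    h_z_given_x = h_xz - h_x                       # H(Z | X)
    return h_z - h_z_given_x, h_z_given_x          # I(X; Z), H(Z | X)


# Toy "collapsed" agent: every input yields the same spread of four templates.
collapsed = [(x, z) for x in ("x1", "x2") for z in "abcd"]

# Toy "input-dependent" agent: each input has its own pair of reasoning labels.
dependent = [("x1", "a"), ("x1", "b"), ("x2", "c"), ("x2", "d")]

mi_collapsed, ce_collapsed = collapse_diagnostics(collapsed)
mi_dependent, ce_dependent = collapse_diagnostics(dependent)
```

In the collapsed toy case the reasoning labels carry no information about the input even though each input still shows varied reasoning, which is the low mutual information, high conditional entropy signature described above. Plug-in estimates of this kind are biased on small samples, so the numbers are only meaningful for illustration.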

All experiments and analyses in the paper are conducted in single-agent reinforcement learning. The authors explicitly note that extending this understanding to multi-agent reinforcement learning remains unresolved. This raises the question of how interactions between agents might cause, amplify, or mitigate the spread of template-like, input-agnostic reasoning patterns.

References

All experiments are single-agent; how template collapse propagates in multi-agent RL remains open.

RAGEN-2: Reasoning Collapse in Agentic RL (2604.06268 - Wang et al., 7 Apr 2026) in Conclusions and Limitations