Potential self‑reinforcing bias from using the same model for orchestration and evaluation
Determine whether using the same language model as both the meta‑orchestrator and the LLM‑as‑a‑judge evaluator in Mimosa introduces self‑reinforcing optimization tendencies, relative to cross‑model configurations in which the two roles are assigned to different models.
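One way to frame the comparison is a small crossed design: run each candidate model as orchestrator, score the resulting execution traces with each candidate model as judge, and compare same‑model scores against cross‑model scores. The sketch below is only an illustration under stated assumptions. The function name `self_preference_gap`, the model labels, and the scores are all hypothetical and are not part of Mimosa or its published evaluation.

```python
from statistics import mean


def self_preference_gap(scores):
    """Mean judge score for same-model pairs minus mean for cross-model pairs.

    `scores` maps (orchestrator_model, judge_model) -> list of judge scores.
    This layout is a hypothetical convention chosen for the sketch.
    """
    same = [s for (orch, judge), vals in scores.items() if orch == judge for s in vals]
    cross = [s for (orch, judge), vals in scores.items() if orch != judge for s in vals]
    if not same or not cross:
        raise ValueError("need at least one same-model and one cross-model configuration")
    return mean(same) - mean(cross)


# Made-up scores, purely for illustration (not measured results).
example_scores = {
    ("model_a", "model_a"): [0.8, 0.9],
    ("model_a", "model_b"): [0.6, 0.7],
    ("model_b", "model_a"): [0.7, 0.7],
    ("model_b", "model_b"): [0.9, 0.8],
}

gap = self_preference_gap(example_scores)
```

A positive gap would only be suggestive. Separating a judge's preference for workflows from its own model from a real quality difference would likely also require a judge‑independent measure of task success on the same traces.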
References
While these roles operate on distinct outputs — the meta-orchestrator proposes a workflow structure, while the judge evaluates the resulting agent execution trace — it remains an open question whether using the same model introduces self-reinforcing optimization tendencies.
— Mimosa Framework: Toward Evolving Multi-Agent Systems for Scientific Research
(2603.28986 - Legrand et al., 30 Mar 2026) in Section 7 (Limitations and Future work), bullet 'Cross-model configuration for meta-orchestrator and judge'