Quantitative metrics and principled selection for optimal specialization trade-off
Develop quantitative metrics to characterize when the number of experts n and the top-K selection in Mixture-of-Experts models should be considered "large" or "small", and determine a principled method to identify the optimal trade-off between expert specialization and collaboration across different model configurations.
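For concreteness, the two quantities named above can be tied to simple descriptors of a routing configuration. The sketch below is an illustration only and is not a metric proposed by the cited paper: `routing_descriptors` is a hypothetical helper, and choosing the activation ratio K/n and the number of distinct K-expert subsets as descriptors is an assumption made here. Neither value says on its own whether a configuration is "large" or "small"; that calibration is the open question.

```python
from math import comb


def routing_descriptors(n: int, k: int) -> dict:
    """Return two basic descriptors of a top-K routing setup (hypothetical helper).

    n: total number of experts in the MoE layer
    k: number of experts selected per token (top-K)
    """
    if not 1 <= k <= n:
        raise ValueError("expected 1 <= k <= n")
    return {
        # Fraction of experts active for each token.
        "activation_ratio": k / n,
        # Number of distinct expert subsets a token can be routed to.
        "expert_combinations": comb(n, k),
    }


if __name__ == "__main__":
    # Example configurations chosen for illustration, not taken from the source.
    for n, k in [(8, 2), (64, 8)]:
        print(n, k, routing_descriptors(n, k))
```

Comparing such descriptors across models would be a starting point for the kind of cross-configuration characterization the problem asks for, but any threshold on them would still have to be justified empirically or theoretically.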
References
However, we currently lack quantitative metrics to characterize "large" or "small" n and K across different models; as a result, determining the optimal trade-off remains largely empirical.
— Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss
(2512.23447 - Lv et al., 29 Dec 2025) in Section 4.3 (The ERC loss is an effective tool for exploring expert specialization) — The optimal specialization degree