Infinite-horizon extensions of reward-free warm-up and MAIL-WARM
Establish whether analogous sample complexity guarantees for interactive Multi-Agent Imitation Learning (MAIL) can be obtained in the infinite-horizon discounted setting, by developing reward-free exploration and analysis tools that do not rely on finite-horizon-specific algorithms such as EULER.
References
"Whether analogous results can be obtained in the infinite-horizon regime remains an open challenge, and progress in this direction could be of interest independent of MAIL."
— Rate optimal learning of equilibria from data (Freihaut et al., arXiv:2510.09325, 10 Oct 2025), Conclusion and future directions