Rate-optimal interactive MAIL with O(ε^{-2}) expert queries
Develop an interactive Multi-Agent Imitation Learning (MAIL) algorithm for finite-horizon two-player zero-sum Markov games that outputs an ε-approximate Nash equilibrium using the optimal number of expert queries, which is of order O(ε^{-2}).
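For concreteness, the target condition can be written with the standard Nash gap. This is only a sketch in assumed notation: the symbols $\mu$, $\nu$, $V_1^{\mu,\nu}$ and $s_1$ are not taken from the source. Here $\mu$ is the max-player's policy, $\nu$ is the min-player's policy, and $V_1^{\mu,\nu}(s_1)$ is the expected cumulative reward from the initial state $s_1$ when the two players follow $\mu$ and $\nu$.

$$
\mathrm{NashGap}(\mu,\nu) \;=\; \max_{\mu'} V_1^{\mu',\nu}(s_1) \;-\; \min_{\nu'} V_1^{\mu,\nu'}(s_1) \;\le\; \varepsilon
$$

Under this convention the gap is nonnegative and vanishes exactly when neither player can improve by deviating. The source may define or normalize the gap differently, for example by averaging over an initial-state distribution.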
References
Open Question 2: Can we design an algorithm which outputs an $\varepsilon$-approximate Nash equilibrium with the optimal order of expert queries, which is $\mathcal{O}(\varepsilon^{-2})$?
— Rate optimal learning of equilibria from data
(2510.09325 - Freihaut et al., 10 Oct 2025) in Introduction (Open Question 2)