Rate-optimal interactive MAIL with O(ε^{-2}) expert queries

Develop an interactive Multi-Agent Imitation Learning (MAIL) algorithm for finite-horizon two-player zero-sum Markov games that outputs an ε-approximate Nash equilibrium using an optimal number of expert queries, of order O(ε^{-2}).

Background

Existing interactive MAIL algorithms, such as MURMAIL, require O(ε^{-8}) expert queries, far from the ε^{-2} dependence suggested by information-theoretic lower bounds. Ignoring constants and all other problem parameters, at ε = 0.1 this is the difference between ε^{-8} = 10^8 and ε^{-2} = 10^2.

This question asks for an algorithmic design that matches the optimal ε-dependence in query complexity while learning an ε-approximate Nash equilibrium, thereby closing the gap between known upper and lower bounds in the interactive setting.
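
For concreteness, one common way to formalize an ε-approximate Nash equilibrium in a two-player zero-sum game is through the Nash gap. The sketch below is an assumption about notation, not a definition taken from the cited paper: V^{μ,ν} denotes the max-player's expected cumulative reward under the policy pair (μ, ν), and the symbols μ, ν, and NashGap are introduced here for illustration only.

```latex
% Assumed notation (not from the source): V^{\mu,\nu} is the expected
% cumulative reward of the max-player when the max-player follows \mu
% and the min-player follows \nu.
\[
\mathrm{NashGap}(\mu,\nu)
  \;=\; \max_{\mu'} V^{\mu',\nu} \;-\; \min_{\nu'} V^{\mu,\nu'},
\qquad
\mathrm{NashGap}(\hat\mu,\hat\nu) \;\le\; \varepsilon .
\]
```

Under this convention the gap is nonnegative and equals zero exactly at a Nash equilibrium, so the target is a pair of policies (μ̂, ν̂) whose gap is at most ε.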

References

Open Question 2. Can we design an algorithm which outputs an $\varepsilon$-approximate Nash equilibrium with the optimal order of expert queries, which is $\mathcal{O}(\varepsilon^{-2})$?

— Rate optimal learning of equilibria from data (2510.09325 - Freihaut et al., 10 Oct 2025) in Introduction (Open Question 2)
