On the Variational Interpretation of Mirror Play in Monotone Games (2403.15636v1)
Abstract: Mirror play (MP) is a well-established primal-dual multi-agent learning algorithm in which all agents simultaneously implement mirror descent in a distributed fashion. The advantage of MP over vanilla gradient play lies in its use of mirror maps that better exploit the geometry of the decision domains. Despite the extensive literature on the asymptotic convergence of MP to equilibrium, the understanding of MP's finite-time behavior before it reaches equilibrium remains rudimentary. To facilitate the study of MP's non-equilibrium performance, this work establishes an equivalence between MP's finite-time primal-dual path (mirror path) in monotone games and the closed-loop Nash equilibrium path of a finite-horizon differential game, referred to as the mirror differential game (MDG). Our construction of the MDG rests on the Brezis-Ekeland variational principle, and the stage cost functional of the MDG is the Fenchel coupling between MP's iterates and the associated gradient updates. This variational interpretation of the mirror path in static games as the equilibrium path of the MDG holds in both deterministic and stochastic cases. Such a variational interpretation translates the non-equilibrium study of learning dynamics into a more tractable equilibrium analysis of dynamic games, as demonstrated in a case study on the Cournot game, where the MP dynamics correspond to a linear-quadratic game.
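To make the update rule concrete, the following minimal sketch runs mirror play (simultaneous mirror descent) on a linear Cournot game, the case study mentioned in the abstract. The demand model, cost vector, entropic mirror map, step size `eta`, and horizon `T` are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch of mirror play: each firm runs mirror descent on its own
# profit while all firms update simultaneously (distributed, primal-dual).
import numpy as np

a, b = 10.0, 1.0                    # assumed inverse demand p(x) = a - b * sum(x)
c = np.array([1.0, 2.0, 1.5])       # assumed marginal production costs
eta, T = 0.05, 2000                 # assumed step size and horizon

def payoff_grad(x):
    """Gradient of firm i's profit x_i * (a - b * sum(x)) - c_i * x_i w.r.t. x_i."""
    return a - b * x.sum() - b * x - c

# Entropic mirror map psi(x) = x*log(x) - x on the nonnegative orthant:
# the primal iterate is recovered from the dual variable via x = exp(y).
y = np.zeros_like(c)                # dual variable aggregating gradient feedback
for _ in range(T):
    x = np.exp(y)                   # mirror step back to the primal domain
    y += eta * payoff_grad(x)       # all firms ascend their payoffs at once

print("approximate Cournot equilibrium quantities:", np.exp(y))
```

Under these assumptions the iterates approach the unique Cournot-Nash equilibrium; the primal-dual pair (x, y) traced over the T iterations is the "mirror path" whose variational interpretation the paper develops.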
- P. Mertikopoulos and Z. Zhou, “Learning in games with continuous action sets and unknown payoff functions,” Mathematical Programming, vol. 173, pp. 465–507, 2019.
- A. S. Nemirovskij and D. B. Yudin, “Problem complexity and method efficiency in optimization,” 1983.
- S. Shalev-Shwartz and Y. Singer, “Convex Repeated Games and Fenchel Duality,” in Advances in Neural Information Processing Systems (B. Schölkopf, J. Platt, and T. Hoffman, eds.), vol. 19, MIT Press, 2006.
- T. Li, G. Peng, Q. Zhu, and T. Başar, “The Confluence of Networks, Games, and Learning: A Game-Theoretic Framework for Multiagent Decision Making Over Networks,” IEEE Control Systems, vol. 42, no. 4, pp. 35–67, 2022.
- Y. Pan, T. Li, and Q. Zhu, “On the resilience of traffic networks under non-equilibrium learning,” in 2023 American Control Conference (ACC), pp. 3484–3489, IEEE, 2023.
- Y. Lei and K. Tang, “Stochastic Composite Mirror Descent: Optimal Bounds with High Probabilities,” in Advances in Neural Information Processing Systems, vol. 31, Curran Associates, Inc., 2018.
- Y. Pan, T. Li, and Q. Zhu, “Is stochastic mirror descent vulnerable to adversarial delay attacks? a traffic assignment resilience study,” in 2023 62nd IEEE Conference on Decision and Control (CDC), pp. 8328–8333, IEEE, 2023.
- H. Brézis and I. Ekeland, “Un principe variationnel associé à certaines équations paraboliques. Le cas indépendant du temps,” C. R. Acad. Sci. Paris Sér. A, vol. 282, pp. 971–974, 1976.
- B. Tzen, A. Raj, M. Raginsky, and F. Bach, “Variational principles for mirror descent and mirror Langevin dynamics,” IEEE Control Systems Letters, 2023.
- Z. Zhou, P. Mertikopoulos, A. L. Moustakas, N. Bambos, and P. Glynn, “Mirror descent learning in continuous games,” in 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pp. 5776–5783, IEEE, 2017.
- P. Mertikopoulos and W. H. Sandholm, “Learning in games via reinforcement and regularization,” Mathematics of Operations Research, vol. 41, no. 4, pp. 1297–1324, 2016.
- B. Gao and L. Pavel, “Continuous-time discounted mirror descent dynamics in monotone concave games,” IEEE Transactions on Automatic Control, vol. 66, no. 11, pp. 5451–5458, 2020.
- S. Liu, T. Li, and Q. Zhu, “Game-Theoretic Distributed Empirical Risk Minimization With Strategic Network Design,” IEEE Transactions on Signal and Information Processing over Networks, vol. 9, pp. 542–556, 2023.
- B. Gao and L. Pavel, “Continuous-time convergence rates in potential and monotone games,” SIAM Journal on Control and Optimization, vol. 60, no. 3, pp. 1712–1731, 2022.
- T. Başar and G. J. Olsder, Dynamic noncooperative game theory. SIAM, 1998.
- F. Facchinei and J.-S. Pang, Finite-dimensional variational inequalities and complementarity problems. Springer, 2003.
- P. Mertikopoulos and M. Staudigl, “On the convergence of gradient-like flows with noisy gradient input,” 2017.
- T. Li, Y. Zhao, and Q. Zhu, “The role of information structures in game-theoretic multi-agent learning,” Annual Reviews in Control, vol. 53, pp. 296–314, 2022.
- W. Krichene, S. Krichene, and A. Bayen, “Convergence of mirror descent dynamics in the routing game,” in 2015 European Control Conference (ECC), pp. 569–574, IEEE, 2015.
- C. Daskalakis, A. Ilyas, V. Syrgkanis, and H. Zeng, “Training gans with optimism,” arXiv preprint arXiv:1711.00141, 2017.
- F. Alvarez, J. Bolte, and O. Brahic, “Hessian Riemannian gradient flows in convex programming,” SIAM Journal on Control and Optimization, vol. 43, no. 2, pp. 477–501, 2004.