Maximum Causal Entropy Inverse Reinforcement Learning for Mean-Field Games (2401.06566v1)
Abstract: In this paper, we introduce the maximum causal entropy inverse reinforcement learning (IRL) problem for discrete-time mean-field games (MFGs) under an infinite-horizon discounted-reward optimality criterion, where the state space of a typical agent is finite. Our approach begins with a comprehensive review of the maximum entropy IRL problem for deterministic and stochastic Markov decision processes (MDPs) in both finite- and infinite-horizon settings. We then formulate the maximum causal entropy IRL problem for MFGs, which is a non-convex optimization problem with respect to policies. Leveraging the linear programming formulation of MDPs, we restructure this IRL problem as a convex optimization problem and devise a gradient descent algorithm that computes the optimal solution with a guaranteed rate of convergence. Finally, we present a new algorithm that formulates the MFG problem as a generalized Nash equilibrium problem (GNEP) and computes the mean-field equilibrium (MFE) for the forward RL problem; we use this method to generate data for a numerical example. We note that this novel algorithm is also applicable to general MFE computations.
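To make the convexification idea concrete: in the standard linear-programming view of discounted MDPs, one optimizes over discounted occupation measures $\mu(s,a)$ subject to the flow constraints

$$\sum_{a} \mu(s,a) \;=\; \rho_0(s) + \gamma \sum_{s',a'} P(s \mid s',a')\, \mu(s',a'), \qquad \mu \ge 0,$$

and the causal entropy of the induced policy is a concave function of $\mu$, which is what turns the IRL problem convex. Below is a minimal, self-contained sketch of the single-agent version of this idea: gradient ascent on the maximum-causal-entropy dual, where soft value iteration yields the soft-optimal policy for the current reward weights and the gradient is the gap between expert and model feature expectations. The toy 2-state/2-action MDP, feature map, and step size are all hypothetical illustrations; this is not the paper's mean-field algorithm.

```python
# Minimal sketch: gradient ascent on the maximum-causal-entropy IRL dual for a
# toy finite MDP. All problem data here are hypothetical.
import numpy as np

gamma = 0.9                     # discount factor
n_s, n_a, n_f = 2, 2, 3         # states, actions, features
rng = np.random.default_rng(0)

P = rng.dirichlet(np.ones(n_s), size=(n_s, n_a))   # kernel P[s, a, s']
phi = rng.normal(size=(n_s, n_a, n_f))             # feature map phi(s, a)
rho0 = np.full(n_s, 1.0 / n_s)                     # initial distribution

def soft_policy(theta, iters=500):
    """Entropy-regularized (soft) value iteration for reward r = phi @ theta."""
    r = phi @ theta
    V = np.zeros(n_s)
    for _ in range(iters):
        Q = r + gamma * P @ V                       # Q[s, a]
        m = Q.max(axis=1)
        V = m + np.log(np.exp(Q - m[:, None]).sum(axis=1))  # stable log-sum-exp
    return np.exp(Q - V[:, None])                   # pi[s, a] = exp(Q - V)

def feature_expectations(pi):
    """Discounted feature expectations under pi via the occupation measure."""
    P_pi = np.einsum('sa,sat->st', pi, P)           # state transitions under pi
    d = np.linalg.solve(np.eye(n_s) - gamma * P_pi.T, rho0)  # state occupancy
    mu = d[:, None] * pi                            # state-action occupancy
    return (mu[:, :, None] * phi).sum(axis=(0, 1))

# "Expert" demonstrations summarized by feature expectations under a
# ground-truth soft-optimal policy.
mu_E = feature_expectations(soft_policy(np.array([1.0, -0.5, 0.3])))

theta = np.zeros(n_f)
for _ in range(2000):
    grad = mu_E - feature_expectations(soft_policy(theta))  # concave-dual gradient
    theta += 0.01 * grad

print("feature-expectation gap:", np.linalg.norm(grad))
```

In the mean-field setting the paper addresses, the occupation measure and the population distribution are coupled through the equilibrium condition, so this single-agent sketch only illustrates the convex-duality mechanism, not the GNEP-based equilibrium computation.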