Polynomial Convergence of Bandit No-Regret Dynamics in Congestion Games (2401.09628v1)
Abstract: We introduce an online learning algorithm in the bandit feedback model that, once adopted by all agents of a congestion game, results in game dynamics that converge to an $\epsilon$-approximate Nash Equilibrium in a polynomial number of rounds with respect to $1/\epsilon$, the number of players and the number of available resources. The proposed algorithm also guarantees sublinear regret to any agent adopting it. As a result, our work answers an open question from arXiv:2206.01880 and extends the recent results of arXiv:2306.15543 to the bandit feedback model. We additionally establish that our online learning algorithm can be implemented in polynomial time for the important special case of Network Congestion Games on Directed Acyclic Graphs (DAGs) by constructing an exact $1$-barycentric spanner for DAGs.
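To make the equilibrium notion in the abstract concrete, the following is a minimal sketch of checking an additive $\epsilon$-approximate Nash equilibrium for a pure strategy profile in a toy congestion game. It is not the paper's algorithm, and the paper's own definition may be stated for mixed strategies. The function names (`player_cost`, `is_eps_nash`), the two-link game and the linear cost functions are illustrative assumptions.

```python
def player_cost(profile, i, cost_fns):
    """Cost of player i: sum of the costs of its resources at their current loads."""
    loads = {}
    for strategy in profile:
        for r in strategy:
            loads[r] = loads.get(r, 0) + 1
    return sum(cost_fns[r](loads[r]) for r in profile[i])


def is_eps_nash(profile, strategy_sets, cost_fns, eps):
    """True if no player can lower its cost by more than eps with a unilateral deviation."""
    for i, options in enumerate(strategy_sets):
        current = player_cost(profile, i, cost_fns)
        for alt in options:
            deviated = profile[:i] + (alt,) + profile[i + 1:]
            if player_cost(deviated, i, cost_fns) < current - eps:
                return False
    return True


# Hypothetical game: two players, two parallel links, cost of a link = its load.
cost_fns = {"a": lambda load: load, "b": lambda load: load}
strategy_sets = [[frozenset({"a"}), frozenset({"b"})]] * 2

split = (frozenset({"a"}), frozenset({"b"}))   # each player alone on a link
shared = (frozenset({"a"}), frozenset({"a"}))  # both players on link "a"

print(is_eps_nash(split, strategy_sets, cost_fns, eps=0.0))
print(is_eps_nash(shared, strategy_sets, cost_fns, eps=0.5))
```

In this toy game the split profile is an exact equilibrium, while the shared profile only qualifies once $\epsilon$ is at least the gain of 1 that a player obtains by switching links.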
- Jacob D Abernethy, Elad Hazan and Alexander Rakhlin “Competing in the dark: An efficient algorithm for bandit linear optimization”, 2009
- “Near-optimal no-regret learning for correlated equilibria in multi-player general-sum games” In STOC ’22: 54th Annual ACM SIGACT Symposium on Theory of Computing, Rome, Italy, June 20 - 24, 2022 ACM, 2022, pp. 736–749 DOI: 10.1145/3519935.3520031
- “On Last-Iterate Convergence Beyond Zero-Sum Games” In International Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA 162, Proceedings of Machine Learning Research PMLR, 2022, pp. 536–581
- “Uncoupled Learning Dynamics with O(log T) Swap Regret in Multiplayer Games” In NeurIPS, 2022 URL: http://papers.nips.cc/paper_files/paper/2022/hash/15d45097f9806983f0629a77e93ee60f-Abstract-Conference.html
- Haris Angelidakis, Dimitris Fotakis and Thanasis Lianeas “Stochastic Congestion Games with Risk-Averse Players” In Algorithmic Game Theory - 6th International Symposium, SAGT 2013, Aachen, Germany, October 21-23, 2013. Proceedings 8146, Lecture Notes in Computer Science Springer, 2013, pp. 86–97
- Sanjeev Arora, Elad Hazan and Satyen Kale “The Multiplicative Weights Update Method: a Meta-Algorithm and Applications” In Theory Comput. 8.1, 2012, pp. 121–164
- “Minimax Policies for Adversarial and Stochastic Bandits” In COLT 2009 - The 22nd Conference on Learning Theory, 2009
- Jean-Yves Audibert, Sébastien Bubeck and Gábor Lugosi “Regret in Online Combinatorial Optimization” In Math. Oper. Res. 39.1 Linthicum, MD, USA: INFORMS, 2014, pp. 31–45
- “The Nonstochastic Multiarmed Bandit Problem” In SIAM J. Comput. 32.1, 2002, pp. 48–77
- Baruch Awerbuch and Robert D. Kleinberg “Adaptive Routing with End-to-End Feedback: Distributed Learning and Geometric Approaches” In Proceedings of the Thirty-Sixth Annual ACM Symposium on Theory of Computing, STOC ’04, 2004, pp. 45–53
- “Communication complexity of Nash equilibrium in potential games (extended abstract)” In 61st IEEE Annual Symposium on Foundations of Computer Science, FOCS 2020, Durham, NC, USA, November 16-19, 2020 IEEE, 2020, pp. 1439–1445
- “An efficient high-probability algorithm for Linear Bandits” arXiv:1610.02072 [cs] arXiv, 2016 DOI: 10.48550/arXiv.1610.02072
- Sébastien Bubeck, Nicolò Cesa-Bianchi and Sham M. Kakade “Towards Minimax Policies for Online Linear Optimization with Bandit Feedback” In COLT 2012 - The 25th Annual Conference on Learning Theory, June 25-27, 2012, Edinburgh, Scotland 23, JMLR Proceedings JMLR.org, 2012, pp. 41.1–41.14
- “On Approximate Pure Nash Equilibria in Weighted Congestion Games with Polynomial Latencies” In 46th International Colloquium on Automata, Languages, and Programming, ICALP 2019, July 9-12, 2019, Patras, Greece 132, LIPIcs Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019, pp. 133:1–133:12
- “Computing Better Approximate Pure Nash Equilibria in Cut Games via Semidefinite Programming” In Proceedings of the 55th Annual ACM Symposium on Theory of Computing, STOC 2023, Orlando, FL, USA, June 20-23, 2023 ACM, 2023, pp. 710–722
- “Approximate pure nash equilibria in weighted congestion games: existence, efficient computation, and structure” In Proceedings of the 13th ACM Conference on Electronic Commerce, EC 2012, Valencia, Spain, June 4-8, 2012 ACM, 2012, pp. 284–301
- “Efficient Computation of Approximate Pure Nash Equilibria in Congestion Games” In IEEE 52nd Annual Symposium on Foundations of Computer Science, FOCS 2011, Palm Springs, CA, USA, October 22-25, 2011 IEEE Computer Society, 2011, pp. 532–541
- Constantin Carathéodory “Über den Variabilitätsbereich der Koeffizienten von Potenzreihen, die gegebene Werte nicht annehmen” In Mathematische Annalen 64.1 Springer, 1907, pp. 95–115
- “Combinatorial bandits” In Journal of Computer and System Sciences 78.5 Elsevier, 2012, pp. 1404–1422
- “Generalized mirror descents in congestion games” In Artificial Intelligence 241 Elsevier, 2016, pp. 217–243
- Liyu Chen, Haipeng Luo and Chen-Yu Wei “Impossible tuning made possible: A new expert algorithm and its applications” In Conference on Learning Theory, 2021, pp. 1216–1259 PMLR
- “Convergence to approximate Nash equilibria in congestion games” In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2007, New Orleans, Louisiana, USA, January 7-9, 2007 SIAM, 2007, pp. 169–178
- “The Price of Anarchy of Finite Congestion Games” In STOC, 2005, pp. 67–73
- “Existence and Complexity of Approximate Equilibria in Weighted Congestion Games” In Math. Oper. Res. 48.1, 2023, pp. 583–602
- Johanne Cohen, Amélie Héliou and Panayotis Mertikopoulos “Hedging Under Uncertainty: Regret Minimization Meets Exponentially Fast Convergence” In Algorithmic Game Theory - 10th International Symposium, SAGT 2017, L’Aquila, Italy, September 12-14, 2017, Proceedings 10504, Lecture Notes in Computer Science Springer, 2017, pp. 252–263
- Patrick L Combettes and Jean-Christophe Pesquet “Proximal splitting methods in signal processing” In Fixed-point algorithms for inverse problems in science and engineering Springer, 2011, pp. 185–212
- “Learning in Congestion Games with Bandit Feedback” NeurIPS, 2022
- Varsha Dani, Thomas P. Hayes and Sham M. Kakade “The Price of Bandit Information for Online Optimization” In Proceedings of the 20th International Conference on Neural Information Processing Systems, NIPS’07 Vancouver, British Columbia, Canada: Curran Associates Inc., 2007, pp. 345–352
- Constantinos Daskalakis, Maxwell Fishelson and Noah Golowich “Near-Optimal No-Regret Learning in General Games” In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual, 2021, pp. 27604–27616
- “Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence” In International Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA 162, Proceedings of Machine Learning Research PMLR, 2022, pp. 5166–5220
- Eyal Even-Dar, Yishay Mansour and Uri Nadav “On the convergence of regret minimization dynamics in concave games” In Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD, USA, May 31 - June 2, 2009 ACM, 2009, pp. 523–532
- A. Fabrikant, C. Papadimitriou and K. Talwar “The complexity of pure Nash equilibria” In ACM Symposium on Theory of Computing (STOC), 2004, pp. 604–612 ACM
- “Near-Optimal No-Regret Learning Dynamics for General Convex Games” In NeurIPS, 2022
- Abraham D Flaxman, Adam Tauman Kalai and H Brendan McMahan “Online convex optimization in the bandit setting: gradient descent without a gradient” In arXiv preprint cs/0408007, 2004
- Dimitris Fotakis, Dimitris Kalimeris and Thanasis Lianeas “Improving Selfish Routing for Risk-Averse Players” In Web and Internet Economics - 11th International Conference, WINE 2015, Amsterdam, The Netherlands, December 9-12, 2015, Proceedings 9470, Lecture Notes in Computer Science Springer, 2015, pp. 328–342
- Dimitris Fotakis, Alexis C. Kaporis and Paul G. Spirakis “Atomic Congestion Games: Fast, Myopic and Concurrent” In Algorithmic Game Theory, First International Symposium, SAGT 2008, Paderborn, Germany, April 30-May 2, 2008. Proceedings 4997, Lecture Notes in Computer Science Springer, 2008, pp. 121–132
- Dimitris Fotakis, Alexis C. Kaporis and Paul G. Spirakis “Efficient Methods for Selfish Network Design” In Automata, Languages and Programming, 36th International Colloquium, ICALP 2009, Rhodes, Greece, July 5-12, 2009, Proceedings, Part II 5556, Lecture Notes in Computer Science Springer, 2009, pp. 459–471
- Dimitris Fotakis, Spyros Kontogiannis and Paul Spirakis “Selfish unsplittable flows” Automata, Languages and Programming: Algorithms and Complexity (ICALP-A 2004) In Theoretical Computer Science 348.2–3, 2005, pp. 226–239 DOI: http://dx.doi.org/10.1016/j.tcs.2005.09.024
- “Node-Max-Cut and the Complexity of Equilibrium in Linear Weighted Congestion Games” In 47th International Colloquium on Automata, Languages, and Programming, ICALP 2020, July 8-11, 2020, Saarbrücken, Germany (Virtual Conference) 168, LIPIcs Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2020, pp. 50:1–50:19
- “On the Hardness of Network Design for Bottleneck Routing Games” In Algorithmic Game Theory - 5th International Symposium, SAGT 2012, Barcelona, Spain, October 22-23, 2012. Proceedings 7615, Lecture Notes in Computer Science Springer, 2012, pp. 156–167
- “Computing Nash equilibria for scheduling on restricted parallel links” In Proceedings of the 36th Annual ACM Symposium on Theory of Computing, Chicago, IL, USA, June 13-16, 2004 ACM, 2004, pp. 613–622
- Yiannis Giannakopoulos, Georgy Noarov and Andreas S. Schulz “Computing Approximate Equilibria in Weighted Congestion Games via Best-Responses” In Math. Oper. Res. 47.1, 2022, pp. 643–664
- “A Unifying Approximate Potential for Weighted Congestion Games” In Theory Comput. Syst. 67.4, 2023, pp. 855–876
- Martin Grötschel, László Lovász and Alexander Schrijver “Geometric Algorithms and Combinatorial Optimization” 2, Algorithms and Combinatorics Springer, 1988
- “The On-Line Shortest Path Problem Under Partial Monitoring” In J. Mach. Learn. Res. 8, 2007, pp. 2369–2403
- Elad Hazan “Introduction to Online Convex Optimization” In CoRR abs/1909.05207, 2019 URL: http://arxiv.org/abs/1909.05207
- Amélie Heliou, Johanne Cohen and Panayotis Mertikopoulos “Learning with Bandit Feedback in Potential Games” In Advances in Neural Information Processing Systems 30 Curran Associates, Inc., 2017 URL: https://papers.nips.cc/paper_files/paper/2017/hash/39ae2ed11b14a4ccb41d35e9d1ba5d11-Abstract.html
- Tim Hoheisel, Maxime Laborde and Adam Oberman “On proximal point-type algorithms for weakly convex functions and their connection to the backward Euler method” In Optimization Online
- “No-regret learning in games with noisy feedback: Faster rates and adaptivity via learning rate separation” In NeurIPS, 2022 URL: http://papers.nips.cc/paper_files/paper/2022/hash/2abad9fd438b40604ddaabe75e6c51dd-Abstract-Conference.html
- “Efficient algorithms for online decision problems” Learning Theory 2003 In Journal of Computer and System Sciences 71.3, 2005, pp. 291–307 DOI: https://doi.org/10.1016/j.jcss.2004.10.016
- Bart Keijzer, Guido Schäfer and Orestis A. Telelis “On the Inefficiency of Equilibria in Linear Bottleneck Congestion Games” In Algorithmic Game Theory 6386, Lecture Notes in Computer Science Springer Berlin Heidelberg, 2010, pp. 335–346 DOI: 10.1007/978-3-642-16170-4_29
- Pieter Kleer “Sampling from the Gibbs Distribution in Congestion Games” In EC ’21: The 22nd ACM Conference on Economics and Computation, Budapest, Hungary, July 18-23, 2021 ACM, 2021, pp. 679–680
- “Computation and efficiency of potential function minimizers of combinatorial congestion games” In Math. Program. 190.1, 2021, pp. 523–560
- Elias Koutsoupias and Christos H. Papadimitriou “Worst-case Equilibria” In STACS, 1999, pp. 404–413
- “Bias no more: high-probability data-dependent regret bounds for adversarial bandits and MDPs” In Advances in Neural Information Processing Systems 33 Curran Associates, Inc., 2020, pp. 15522–15533 URL: https://proceedings.neurips.cc/paper/2020/hash/b2ea5e977c5fc1ccfa74171a9723dd61-Abstract.html
- “Global Convergence of Multi-Agent Policy Gradient in Markov Potential Games” In International Conference on Learning Representations, 2022 URL: https://openreview.net/forum?id=gfwON7rAm4
- Marios Mavronicolas and Paul G. Spirakis “The price of selfish routing” In Proceedings on 33rd Annual ACM Symposium on Theory of Computing, July 6-8, 2001, Heraklion, Crete, Greece ACM, 2001, pp. 510–519
- H Brendan McMahan and Avrim Blum “Online geometric optimization in the bandit setting against an adaptive adversary” In Learning Theory: 17th Annual Conference on Learning Theory, COLT 2004, Banff, Canada, July 1-4, 2004. Proceedings 17, 2004, pp. 109–123 Springer
- “Learning in games with continuous action sets and unknown payoff functions” In Math. Program. 173.1-2, 2019, pp. 465–507
- Dov Monderer and Lloyd S Shapley “Potential games” In Games and economic behavior 14.1 Elsevier, 1996, pp. 124–143
- “An Efficient Algorithm for Learning with Semi-bandit Feedback” In Algorithmic Learning Theory - 24th International Conference, ALT 2013, Singapore, October 6-9, 2013. Proceedings 8139, Lecture Notes in Computer Science Springer, 2013, pp. 234–248
- Gerasimos Palaiopanos, Ioannis Panageas and Georgios Piliouras “Multiplicative Weights Update with Constant Step-Size in Congestion Games: Convergence, Limit Cycles and Chaos” In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, 2017, pp. 5872–5882
- “Semi Bandit dynamics in Congestion Games: Convergence to Nash Equilibrium and No-Regret Guarantees.”, 2023
- “Proximal algorithms” In Foundations and trends® in Optimization 1.3 Now Publishers, Inc., 2014, pp. 127–239
- Georgios Piliouras, Ryann Sim and Stratis Skoulakis “Beyond Time-Average Convergence: Near-Optimal Uncoupled Online Learning via Clairvoyant Multiplicative Weights Update” In NeurIPS, 2022 URL: http://papers.nips.cc/paper_files/paper/2022/hash/8bd5148caced2d73cea7b6961a874a49-Abstract-Conference.html
- Robert W Rosenthal “A class of games possessing pure-strategy Nash equilibria” In International Journal of Game Theory 2 Physica-Verlag, 1973, pp. 65–67
- Tim Roughgarden “Intrinsic robustness of the price of anarchy” In Proc. of STOC, 2009, pp. 513–522
- “How bad is selfish routing?” In Journal of the ACM (JACM) 49.2 ACM, 2002, pp. 236–259
- “No-regret dynamics and fictitious play” In Journal of Economic Theory 148.2, 2013, pp. 825–842
- Dong Quan Vu, Kimon Antonakopoulos and Panayotis Mertikopoulos “Fast Routing under Uncertainty: Adaptive Learning in Congestion Games via Exponential Weights” In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual, 2021, pp. 14708–14720
- “Learning in Games with Lossy Feedback” In Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, December 3-8, 2018, Montréal, Canada, 2018, pp. 5140–5150
- “Return of the bias: Almost minimax optimal high probability bounds for adversarial linear bandits” ISSN: 2640-3498 In Proceedings of Thirty Fifth Conference on Learning Theory PMLR, 2022, pp. 3285–3312 URL: https://proceedings.mlr.press/v178/zimmert22b.html
- Martin Zinkevich “Online Convex Programming and Generalized Infinitesimal Gradient Ascent” In Machine Learning, Proceedings of the Twentieth International Conference (ICML 2003), August 21-24, 2003, Washington, DC, USA AAAI Press, 2003, pp. 928–936