
Polynomial Convergence of Bandit No-Regret Dynamics in Congestion Games (2401.09628v1)

Published 17 Jan 2024 in cs.GT

Abstract: We introduce an online learning algorithm in the bandit feedback model that, once adopted by all agents of a congestion game, results in game dynamics that converge to an $\epsilon$-approximate Nash Equilibrium in a number of rounds that is polynomial in $1/\epsilon$, the number of players, and the number of available resources. The proposed algorithm also guarantees sublinear regret to any agent adopting it. As a result, our work answers an open question from arXiv:2206.01880 and extends the recent results of arXiv:2306.15543 to the bandit feedback model. We additionally establish that our online learning algorithm can be implemented in polynomial time for the important special case of Network Congestion Games on Directed Acyclic Graphs (DAGs) by constructing an exact $1$-barycentric spanner for DAGs.
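To make the abstract's terms concrete, the following is a minimal sketch of a congestion game, the bandit feedback a player receives, and a check for an $\epsilon$-approximate Nash Equilibrium. It is not the paper's algorithm. It assumes an additive $\epsilon$, restricts attention to pure strategy profiles, and uses hypothetical names throughout (`loads`, `player_cost`, `bandit_feedback`, `is_eps_nash`, and the two-link toy game).

```python
from collections import Counter
from typing import Callable, Dict, FrozenSet, List, Sequence

# Assumption: a strategy is a set of resources (e.g. the edges of a path).
Strategy = FrozenSet[str]
CostFns = Dict[str, Callable[[int], float]]


def loads(profile: Sequence[Strategy]) -> Counter:
    """Count how many players use each resource under the given profile."""
    counts: Counter = Counter()
    for strategy in profile:
        counts.update(strategy)
    return counts


def player_cost(i: int, profile: Sequence[Strategy], cost_fns: CostFns) -> float:
    """Total cost of player i: sum of the load-dependent costs of its resources."""
    load = loads(profile)
    return sum(cost_fns[r](load[r]) for r in profile[i])


def bandit_feedback(i: int, profile: Sequence[Strategy], cost_fns: CostFns) -> float:
    """Bandit feedback: player i observes only the scalar cost of its own choice,
    not the per-resource costs and not the costs of strategies it did not play."""
    return player_cost(i, profile, cost_fns)


def is_eps_nash(
    profile: Sequence[Strategy],
    strategy_sets: Sequence[List[Strategy]],
    cost_fns: CostFns,
    eps: float,
) -> bool:
    """True if no player can lower its cost by more than eps (additive)
    through a unilateral deviation to another pure strategy."""
    for i, options in enumerate(strategy_sets):
        current = player_cost(i, profile, cost_fns)
        for alternative in options:
            deviated = list(profile)
            deviated[i] = alternative
            if player_cost(i, deviated, cost_fns) < current - eps:
                return False
    return True


# Toy game (hypothetical): two players, two parallel links, cost equals load.
link_a, link_b = frozenset({"a"}), frozenset({"b"})
cost_fns: CostFns = {"a": lambda k: float(k), "b": lambda k: float(k)}
strategy_sets = [[link_a, link_b], [link_a, link_b]]

split = [link_a, link_b]    # one player per link
crowded = [link_a, link_a]  # both players on link a

print(is_eps_nash(split, strategy_sets, cost_fns, eps=0.0))
print(is_eps_nash(crowded, strategy_sets, cost_fns, eps=0.5))
print(bandit_feedback(0, crowded, cost_fns))
```

In the toy game, splitting across the two links leaves no profitable deviation, while the crowded profile lets either player save one unit of cost by switching, so it only qualifies as an approximate equilibrium once $\epsilon$ is at least 1.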

References (78)
  1. Jacob D Abernethy, Elad Hazan and Alexander Rakhlin “Competing in the dark: An efficient algorithm for bandit linear optimization”, 2009
  2. “Near-optimal no-regret learning for correlated equilibria in multi-player general-sum games” In STOC ’22: 54th Annual ACM SIGACT Symposium on Theory of Computing, Rome, Italy, June 20 - 24, 2022 ACM, 2022, pp. 736–749 DOI: 10.1145/3519935.3520031
  3. “On Last-Iterate Convergence Beyond Zero-Sum Games” In International Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA 162, Proceedings of Machine Learning Research PMLR, 2022, pp. 536–581
  4. “Uncoupled Learning Dynamics with O(log T) Swap Regret in Multiplayer Games” In NeurIPS, 2022 URL: http://papers.nips.cc/paper_files/paper/2022/hash/15d45097f9806983f0629a77e93ee60f-Abstract-Conference.html
  5. Haris Angelidakis, Dimitris Fotakis and Thanasis Lianeas “Stochastic Congestion Games with Risk-Averse Players” In Algorithmic Game Theory - 6th International Symposium, SAGT 2013, Aachen, Germany, October 21-23, 2013. Proceedings 8146, Lecture Notes in Computer Science Springer, 2013, pp. 86–97
  6. Sanjeev Arora, Elad Hazan and Satyen Kale “The Multiplicative Weights Update Method: a Meta-Algorithm and Applications” In Theory Comput. 8.1, 2012, pp. 121–164
  7. “Minimax Policies for Adversarial and Stochastic Bandits” In COLT 2009 - The 22nd Conference on Learning Theory, 2009
  8. Jean-Yves Audibert, Sébastien Bubeck and Gábor Lugosi “Regret in Online Combinatorial Optimization” In Math. Oper. Res. 39.1 Linthicum, MD, USA: INFORMS, 2014, pp. 31–45
  9. “The Nonstochastic Multiarmed Bandit Problem” In SIAM J. Comput. 32.1, 2002, pp. 48–77
  10. Baruch Awerbuch and Robert D Kleinberg “Adaptive routing with end-to-end feedback: Distributed learning and geometric approaches” In Proceedings of the thirty-sixth annual ACM symposium on Theory of computing, 2004, pp. 45–53
  11. Baruch Awerbuch and Robert D. Kleinberg “Adaptive Routing with End-to-End Feedback: Distributed Learning and Geometric Approaches” In Proceedings of the Thirty-Sixth Annual ACM Symposium on Theory of Computing, STOC ’04, 2004, pp. 45–53
  12. “Communication complexity of Nash equilibrium in potential games (extended abstract)” In 61st IEEE Annual Symposium on Foundations of Computer Science, FOCS 2020, Durham, NC, USA, November 16-19, 2020 IEEE, 2020, pp. 1439–1445
  13. “An efficient high-probability algorithm for Linear Bandits” arXiv:1610.02072 [cs] arXiv, 2016 DOI: 10.48550/arXiv.1610.02072
  14. Sébastien Bubeck, Nicolò Cesa-Bianchi and Sham M. Kakade “Towards Minimax Policies for Online Linear Optimization with Bandit Feedback” In COLT 2012 - The 25th Annual Conference on Learning Theory, June 25-27, 2012, Edinburgh, Scotland 23, JMLR Proceedings JMLR.org, 2012, pp. 41.1–41.14
  15. “On Approximate Pure Nash Equilibria in Weighted Congestion Games with Polynomial Latencies” In 46th International Colloquium on Automata, Languages, and Programming, ICALP 2019, July 9-12, 2019, Patras, Greece 132, LIPIcs Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019, pp. 133:1–133:12
  16. “Computing Better Approximate Pure Nash Equilibria in Cut Games via Semidefinite Programming” In Proceedings of the 55th Annual ACM Symposium on Theory of Computing, STOC 2023, Orlando, FL, USA, June 20-23, 2023 ACM, 2023, pp. 710–722
  17. “Approximate pure nash equilibria in weighted congestion games: existence, efficient computation, and structure” In Proceedings of the 13th ACM Conference on Electronic Commerce, EC 2012, Valencia, Spain, June 4-8, 2012 ACM, 2012, pp. 284–301
  18. “Efficient Computation of Approximate Pure Nash Equilibria in Congestion Games” In IEEE 52nd Annual Symposium on Foundations of Computer Science, FOCS 2011, Palm Springs, CA, USA, October 22-25, 2011 IEEE Computer Society, 2011, pp. 532–541
  19. Constantin Carathéodory “Über den Variabilitätsbereich der Koeffizienten von Potenzreihen, die gegebene Werte nicht annehmen” In Mathematische Annalen 64.1 Springer, 1907, pp. 95–115
  20. “Combinatorial bandits” In Journal of Computer and System Sciences 78.5 Elsevier, 2012, pp. 1404–1422
  21. “Combinatorial bandits” In J. Comput. Syst. Sci. 78.5, 2012, pp. 1404–1422
  22. “Generalized mirror descents in congestion games” In Artificial Intelligence 241 Elsevier, 2016, pp. 217–243
  23. Liyu Chen, Haipeng Luo and Chen-Yu Wei “Impossible tuning made possible: A new expert algorithm and its applications” In Conference on Learning Theory, 2021, pp. 1216–1259 PMLR
  24. “Convergence to approximate Nash equilibria in congestion games” In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2007, New Orleans, Louisiana, USA, January 7-9, 2007 SIAM, 2007, pp. 169–178
  25. “The Price of Anarchy of Finite Congestion Games” In STOC, 2005, pp. 67–73
  26. “Existence and Complexity of Approximate Equilibria in Weighted Congestion Games” In Math. Oper. Res. 48.1, 2023, pp. 583–602
  27. Johanne Cohen, Amélie Héliou and Panayotis Mertikopoulos “Hedging Under Uncertainty: Regret Minimization Meets Exponentially Fast Convergence” In Algorithmic Game Theory - 10th International Symposium, SAGT 2017, L’Aquila, Italy, September 12-14, 2017, Proceedings 10504, Lecture Notes in Computer Science Springer, 2017, pp. 252–263
  28. Patrick L Combettes and Jean-Christophe Pesquet “Proximal splitting methods in signal processing” In Fixed-point algorithms for inverse problems in science and engineering Springer, 2011, pp. 185–212
  29. “Learning in Congestion Games with Bandit Feedback” NeurIPS, 2022
  30. Varsha Dani, Thomas P. Hayes and Sham M. Kakade “The Price of Bandit Information for Online Optimization” In Proceedings of the 20th International Conference on Neural Information Processing Systems, NIPS’07 Vancouver, British Columbia, Canada: Curran Associates Inc., 2007, pp. 345–352
  31. Varsha Dani, Sham M Kakade and Thomas Hayes “The price of bandit information for online optimization” In Advances in Neural Information Processing Systems 20, 2007
  32. Constantinos Daskalakis, Maxwell Fishelson and Noah Golowich “Near-Optimal No-Regret Learning in General Games” In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual, 2021, pp. 27604–27616
  33. “Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence” In International Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA 162, Proceedings of Machine Learning Research PMLR, 2022, pp. 5166–5220
  34. Eyal Even-Dar, Yishay Mansour and Uri Nadav “On the convergence of regret minimization dynamics in concave games” In Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD, USA, May 31 - June 2, 2009 ACM, 2009, pp. 523–532
  35. A. Fabrikant, C. Papadimitriou and K. Talwar “The complexity of pure Nash equilibria” In ACM Symposium on Theory of Computing (STOC), 2004, pp. 604–612 ACM
  36. “Near-Optimal No-Regret Learning Dynamics for General Convex Games” In NeurIPS, 2022
  37. Abraham D Flaxman, Adam Tauman Kalai and H Brendan McMahan “Online convex optimization in the bandit setting: gradient descent without a gradient” In arXiv preprint cs/0408007, 2004
  38. Dimitris Fotakis, Dimitris Kalimeris and Thanasis Lianeas “Improving Selfish Routing for Risk-Averse Players” In Web and Internet Economics - 11th International Conference, WINE 2015, Amsterdam, The Netherlands, December 9-12, 2015, Proceedings 9470, Lecture Notes in Computer Science Springer, 2015, pp. 328–342
  39. Dimitris Fotakis, Alexis C. Kaporis and Paul G. Spirakis “Atomic Congestion Games: Fast, Myopic and Concurrent” In Algorithmic Game Theory, First International Symposium, SAGT 2008, Paderborn, Germany, April 30-May 2, 2008. Proceedings 4997, Lecture Notes in Computer Science Springer, 2008, pp. 121–132
  40. Dimitris Fotakis, Alexis C. Kaporis and Paul G. Spirakis “Efficient Methods for Selfish Network Design” In Automata, Languages and Programming, 36th International Colloquium, ICALP 2009, Rhodes, Greece, July 5-12, 2009, Proceedings, Part II 5556, Lecture Notes in Computer Science Springer, 2009, pp. 459–471
  41. Dimitris Fotakis, Spyros Kontogiannis and Paul Spirakis “Selfish unsplittable flows” Automata, Languages and Programming: Algorithms and Complexity (ICALP-A 2004) In Theoretical Computer Science 348.2–3, 2005, pp. 226–239 DOI: http://dx.doi.org/10.1016/j.tcs.2005.09.024
  42. “Node-Max-Cut and the Complexity of Equilibrium in Linear Weighted Congestion Games” In 47th International Colloquium on Automata, Languages, and Programming, ICALP 2020, July 8-11, 2020, Saarbrücken, Germany (Virtual Conference) 168, LIPIcs Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2020, pp. 50:1–50:19
  43. “On the Hardness of Network Design for Bottleneck Routing Games” In Algorithmic Game Theory - 5th International Symposium, SAGT 2012, Barcelona, Spain, October 22-23, 2012. Proceedings 7615, Lecture Notes in Computer Science Springer, 2012, pp. 156–167
  44. “Computing Nash equilibria for scheduling on restricted parallel links” In Proceedings of the 36th Annual ACM Symposium on Theory of Computing, Chicago, IL, USA, June 13-16, 2004 ACM, 2004, pp. 613–622
  45. Yiannis Giannakopoulos, Georgy Noarov and Andreas S. Schulz “Computing Approximate Equilibria in Weighted Congestion Games via Best-Responses” In Math. Oper. Res. 47.1, 2022, pp. 643–664
  46. “A Unifying Approximate Potential for Weighted Congestion Games” In Theory Comput. Syst. 67.4, 2023, pp. 855–876
  47. Martin Grötschel, László Lovász and Alexander Schrijver “Geometric Algorithms and Combinatorial Optimization” 2, Algorithms and Combinatorics Springer, 1988
  48. “The On-Line Shortest Path Problem Under Partial Monitoring” In J. Mach. Learn. Res. 8, 2007, pp. 2369–2403
  49. “The On-Line Shortest Path Problem Under Partial Monitoring.” In Journal of Machine Learning Research 8.10, 2007
  50. Elad Hazan “Introduction to Online Convex Optimization” In CoRR abs/1909.05207, 2019 URL: http://arxiv.org/abs/1909.05207
  51. Amélie Heliou, Johanne Cohen and Panayotis Mertikopoulos “Learning with Bandit Feedback in Potential Games” In Advances in Neural Information Processing Systems 30 Curran Associates, Inc., 2017 URL: https://papers.nips.cc/paper_files/paper/2017/hash/39ae2ed11b14a4ccb41d35e9d1ba5d11-Abstract.html
  52. Tim Hoheisel, Maxime Laborde and Adam Oberman “On proximal point-type algorithms for weakly convex functions and their connection to the backward euler method” In Optimization Online ()
  53. “No-regret learning in games with noisy feedback: Faster rates and adaptivity via learning rate separation” In NeurIPS, 2022 URL: http://papers.nips.cc/paper_files/paper/2022/hash/2abad9fd438b40604ddaabe75e6c51dd-Abstract-Conference.html
  54. “Efficient algorithms for online decision problems” Learning Theory 2003 In Journal of Computer and System Sciences 71.3, 2005, pp. 291–307 DOI: https://doi.org/10.1016/j.jcss.2004.10.016
  55. Bart Keijzer, Guido Schäfer and Orestis A. Telelis “On the Inefficiency of Equilibria in Linear Bottleneck Congestion Games” In Algorithmic Game Theory 6386, Lecture Notes in Computer Science Springer Berlin Heidelberg, 2010, pp. 335–346 DOI: 10.1007/978-3-642-16170-4_29
  56. Pieter Kleer “Sampling from the Gibbs Distribution in Congestion Games” In EC ’21: The 22nd ACM Conference on Economics and Computation, Budapest, Hungary, July 18-23, 2021 ACM, 2021, pp. 679–680
  57. “Computation and efficiency of potential function minimizers of combinatorial congestion games” In Math. Program. 190.1, 2021, pp. 523–560
  58. Elias Koutsoupias and Christos H. Papadimitriou “Worst-case Equilibria” In STACS, 1999, pp. 404–413
  59. “Bias no more: high-probability data-dependent regret bounds for adversarial bandits and MDPs” In Advances in Neural Information Processing Systems 33 Curran Associates, Inc., 2020, pp. 15522–15533 URL: https://proceedings.neurips.cc/paper/2020/hash/b2ea5e977c5fc1ccfa74171a9723dd61-Abstract.html
  60. “Global Convergence of Multi-Agent Policy Gradient in Markov Potential Games” In International Conference on Learning Representations, 2022 URL: https://openreview.net/forum?id=gfwON7rAm4
  61. Marios Mavronicolas and Paul G. Spirakis “The price of selfish routing” In Proceedings on 33rd Annual ACM Symposium on Theory of Computing, July 6-8, 2001, Heraklion, Crete, Greece ACM, 2001, pp. 510–519
  62. H Brendan McMahan and Avrim Blum “Online geometric optimization in the bandit setting against an adaptive adversary” In Learning Theory: 17th Annual Conference on Learning Theory, COLT 2004, Banff, Canada, July 1-4, 2004. Proceedings 17, 2004, pp. 109–123 Springer
  63. “Learning in games with continuous action sets and unknown payoff functions” In Math. Program. 173.1-2, 2019, pp. 465–507
  64. “Potential Games” In Games and Economic Behavior, 1996, pp. 124–143
  65. Dov Monderer and Lloyd S Shapley “Potential games” In Games and economic behavior 14.1 Elsevier, 1996, pp. 124–143
  66. “An Efficient Algorithm for Learning with Semi-bandit Feedback” In Algorithmic Learning Theory - 24th International Conference, ALT 2013, Singapore, October 6-9, 2013. Proceedings 8139, Lecture Notes in Computer Science Springer, 2013, pp. 234–248
  67. Gerasimos Palaiopanos, Ioannis Panageas and Georgios Piliouras “Multiplicative Weights Update with Constant Step-Size in Congestion Games: Convergence, Limit Cycles and Chaos” In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, 2017, pp. 5872–5882
  68. “Semi Bandit dynamics in Congestion Games: Convergence to Nash Equilibrium and No-Regret Guarantees.”, 2023
  69. “Proximal algorithms” In Foundations and trends® in Optimization 1.3 Now Publishers, Inc., 2014, pp. 127–239
  70. Georgios Piliouras, Ryann Sim and Stratis Skoulakis “Beyond Time-Average Convergence: Near-Optimal Uncoupled Online Learning via Clairvoyant Multiplicative Weights Update” In NeurIPS, 2022 URL: http://papers.nips.cc/paper_files/paper/2022/hash/8bd5148caced2d73cea7b6961a874a49-Abstract-Conference.html
  71. Robert W Rosenthal “A class of games possessing pure-strategy Nash equilibria” In International Journal of Game Theory 2 Physica-Verlag, 1973, pp. 65–67
  72. Tim Roughgarden “Intrinsic robustness of the price of anarchy” In Proc. of STOC, 2009, pp. 513–522
  73. “How bad is selfish routing?” In Journal of the ACM (JACM) 49.2 ACM, 2002, pp. 236–259
  74. “No-regret dynamics and fictitious play” In Journal of Economic Theory 148.2, 2013, pp. 825–842
  75. Dong Quan Vu, Kimon Antonakopoulos and Panayotis Mertikopoulos “Fast Routing under Uncertainty: Adaptive Learning in Congestion Games via Exponential Weights” In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual, 2021, pp. 14708–14720
  76. “Learning in Games with Lossy Feedback” In Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, December 3-8, 2018, Montréal, Canada, 2018, pp. 5140–5150
  77. “Return of the bias: Almost minimax optimal high probability bounds for adversarial linear bandits” ISSN: 2640-3498 In Proceedings of Thirty Fifth Conference on Learning Theory PMLR, 2022, pp. 3285–3312 URL: https://proceedings.mlr.press/v178/zimmert22b.html
  78. Martin Zinkevich “Online Convex Programming and Generalized Infinitesimal Gradient Ascent” In Machine Learning, Proceedings of the Twentieth International Conference (ICML 2003), August 21-24, 2003, Washington, DC, USA AAAI Press, 2003, pp. 928–936
