
Exploration-Exploitation in Multi-Agent Competition: Convergence with Bounded Rationality (2106.12928v1)

Published 24 Jun 2021 in cs.GT, cs.LG, cs.MA, econ.TH, and math.DS

Abstract: The interplay between exploration and exploitation in competitive multi-agent learning is still far from being well understood. Motivated by this, we study smooth Q-learning, a prototypical learning model that explicitly captures the balance between game rewards and exploration costs. We show that Q-learning always converges to the unique quantal-response equilibrium (QRE), the standard solution concept for games under bounded rationality, in weighted zero-sum polymatrix games with heterogeneous learning agents using positive exploration rates. Complementing recent results about convergence in weighted potential games, we show that fast convergence of Q-learning in competitive settings is obtained regardless of the number of agents and without any need for parameter fine-tuning. As showcased by our experiments in network zero-sum games, these theoretical results provide the necessary guarantees for an algorithmic approach to the currently open problem of equilibrium selection in competitive multi-agent settings.
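The dynamics the abstract describes can be illustrated with a minimal sketch of smooth (Boltzmann) Q-learning in the simplest zero-sum game, matching pennies. This is not the paper's code; the payoff matrix, step size, and temperature below are illustrative choices. With a positive exploration rate, both players' softmax policies settle at the game's unique QRE, which for matching pennies is uniform play at any temperature.

```python
import numpy as np

# Matching pennies payoffs for the row player; the column player gets the negative.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

T = 0.5    # exploration rate (softmax temperature); positive, as the theorem requires
lr = 0.1   # learning step size (illustrative)

def softmax(q, temp):
    z = np.exp(q / temp)
    return z / z.sum()

# Start away from the equilibrium to show convergence.
Qx = np.array([0.5, -0.5])   # row player's Q-values
Qy = np.array([-0.3, 0.3])   # column player's Q-values

for _ in range(5000):
    x = softmax(Qx, T)              # row player's smoothed best response
    y = softmax(Qy, T)              # column player's smoothed best response
    Qx += lr * (A @ y - Qx)         # Q-values track expected payoff vs. opponent's mix
    Qy += lr * (-A.T @ x - Qy)

x, y = softmax(Qx, T), softmax(Qy, T)
print(x, y)  # both approach [0.5, 0.5], the unique QRE of matching pennies
```

The exploration term (the `-Q` drift inside each update, combined with the softmax) damps the cycling that plain best-response dynamics exhibit in zero-sum games, which is why convergence here needs no parameter fine-tuning beyond a positive temperature.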

Authors (3)
  1. Stefanos Leonardos (33 papers)
  2. Georgios Piliouras (130 papers)
  3. Kelly Spendlove (9 papers)
Citations (26)
