
Global Convergence of Localized Policy Iteration in Networked Multi-Agent Reinforcement Learning (2211.17116v1)

Published 30 Nov 2022 in cs.LG, cs.AI, cs.MA, and math.OC

Abstract: We study a multi-agent reinforcement learning (MARL) problem where the agents interact over a given network. The goal of the agents is to cooperatively maximize the average of their entropy-regularized long-term rewards. To overcome the curse of dimensionality and to reduce communication, we propose a Localized Policy Iteration (LPI) algorithm that provably learns a near-globally-optimal policy using only local information. In particular, we show that, despite restricting each agent's attention to only its $\kappa$-hop neighborhood, the agents are able to learn a policy with an optimality gap that decays polynomially in $\kappa$. In addition, we show the finite-sample convergence of LPI to the global optimal policy, which explicitly captures the trade-off between optimality and computational complexity in choosing $\kappa$. Numerical simulations demonstrate the effectiveness of LPI.
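The κ-hop restriction described in the abstract can be illustrated with a minimal sketch (not code from the paper): each agent conditions its policy only on the states of agents within κ graph hops, which a breadth-first search over the communication network can compute. The graph and function names here are hypothetical.

```python
from collections import deque

def kappa_hop_neighborhood(adj, agent, kappa):
    """Collect all agents within kappa hops of `agent` via BFS."""
    seen = {agent}
    frontier = deque([(agent, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == kappa:
            continue  # do not expand beyond kappa hops
        for nb in adj[node]:
            if nb not in seen:
                seen.add(nb)
                frontier.append((nb, dist + 1))
    return seen

# Hypothetical line network of 5 agents: 0 - 1 - 2 - 3 - 4
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(sorted(kappa_hop_neighborhood(adj, 2, 1)))  # → [1, 2, 3]
```

Increasing κ enlarges each agent's observed neighborhood, shrinking the optimality gap (polynomially in κ, per the abstract) at the cost of more communication and computation.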

Authors (6)
  1. Yizhou Zhang (18 papers)
  2. Guannan Qu (48 papers)
  3. Pan Xu (68 papers)
  4. Yiheng Lin (50 papers)
  5. Zaiwei Chen (21 papers)
  6. Adam Wierman (132 papers)
Citations (24)
