
Scalable Model-based Policy Optimization for Decentralized Networked Systems (2207.06559v2)

Published 13 Jul 2022 in cs.LG, cs.AI, cs.MA, math.OC, and stat.ML

Abstract: Reinforcement learning algorithms require a large number of samples; this often limits their real-world applications, even on simple tasks. The challenge is more pronounced in multi-agent tasks, where each step of operation is more costly, requiring communication or the shifting of resources. This work aims to improve the data efficiency of multi-agent control through model-based learning. We consider networked systems in which agents are cooperative and communicate only locally with their neighbors, and propose the decentralized model-based policy optimization framework (DMPO). In our method, each agent learns a dynamics model to predict future states and broadcasts its predictions to its neighbors, and the policies are then trained on model rollouts. To alleviate the bias of model-generated data, we restrict model usage to generating myopic (short-horizon) rollouts, thus reducing the compounding error of model generation. To preserve the independence of policy updates, we introduce an extended value function and theoretically prove that the resulting policy gradient is a close approximation to the true policy gradient. We evaluate our algorithm on several benchmarks for intelligent transportation systems: connected autonomous vehicle control tasks (Flow and CACC) and adaptive traffic signal control (ATSC). Empirical results show that our method achieves superior data efficiency and matches the performance of model-free methods that use true models.
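To make the workflow described in the abstract concrete, below is a minimal, hedged sketch of a DMPO-style training loop. It assumes a toy three-agent chain topology, linear stand-in dynamics models, random placeholder policies, and illustrative names such as `rollout_horizon`, `neighbors`, and `extended_value`; none of these come from the authors' code, and the sketch only illustrates the overall structure (decentralized model prediction, neighbor broadcast, short model rollouts, neighborhood-limited value estimates).

```python
# Sketch of a DMPO-style loop: decentralized learned models, neighbor
# broadcast of predictions, and short ("myopic") model rollouts.
# All models, policies, and the value stand-in below are illustrative
# assumptions, not the paper's implementation.
import numpy as np

rng = np.random.default_rng(0)

n_agents = 3
state_dim, action_dim = 4, 2
# Local communication graph: each agent only sees its neighbors' predictions.
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
rollout_horizon = 3  # short rollouts limit compounding model error

# Each agent keeps its own dynamics model; a random linear map stands in
# for a trained neural-network predictor.
models = [(rng.normal(size=(state_dim, state_dim)) * 0.1,
           rng.normal(size=(state_dim, action_dim)) * 0.1)
          for _ in range(n_agents)]

def policy(agent, local_obs):
    """Placeholder decentralized policy: random Gaussian actions."""
    return rng.normal(size=action_dim)

def predict_next_state(agent, state, action):
    """Agent's own model predicts its next local state, which is then broadcast."""
    A, B = models[agent]
    return state + A @ state + B @ action

def extended_value(agent, states):
    """Stand-in for the extended value function: a score computed only over
    the states of the agent's neighborhood (here, a simple negative norm)."""
    return -sum(np.linalg.norm(states[j]) for j in neighbors[agent])

# Branched model rollout: start from real states, roll the learned models
# forward a few steps, and collect model-generated transitions for training.
real_states = [rng.normal(size=state_dim) for _ in range(n_agents)]
model_buffer = []
states = list(real_states)
for t in range(rollout_horizon):
    actions = [policy(i, states[i]) for i in range(n_agents)]
    next_states = [predict_next_state(i, states[i], actions[i])
                   for i in range(n_agents)]
    # Each agent evaluates returns using only its neighborhood's predictions.
    values = [extended_value(i, next_states) for i in range(n_agents)]
    model_buffer.append((states, actions, next_states, values))
    states = next_states  # "broadcast" predictions for the next step

print(f"collected {len(model_buffer)} model-generated transition batches")
```

In a full implementation, the model-generated transitions would feed a decentralized policy-gradient update per agent; the sketch stops at data collection to keep the example self-contained.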

Authors (7)
  1. Yali Du (63 papers)
  2. Chengdong Ma (12 papers)
  3. Yuchen Liu (156 papers)
  4. Runji Lin (18 papers)
  5. Hao Dong (175 papers)
  6. Jun Wang (991 papers)
  7. Yaodong Yang (169 papers)
Citations (7)
