Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
119 tokens/sec
GPT-4o
56 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Robust Multi-Agent Reinforcement Learning by Mutual Information Regularization (2310.09833v3)

Published 15 Oct 2023 in cs.LG and cs.AI

Abstract: In multi-agent reinforcement learning (MARL), ensuring robustness against unpredictable or worst-case actions by allies is crucial for real-world deployment. Existing robust MARL methods either approximate or enumerate all possible threat scenarios against worst-case adversaries, leading to computational intensity and reduced robustness. In contrast, human learning efficiently acquires robust behaviors in daily life without preparing for every possible threat. Inspired by this, we frame robust MARL as an inference problem, with worst-case robustness implicitly optimized under all threat scenarios via off-policy evaluation. Within this framework, we demonstrate that Mutual Information Regularization as Robust Regularization (MIR3) during routine training is guaranteed to maximize a lower bound on robustness, without the need for adversaries. Further insights show that MIR3 acts as an information bottleneck, preventing agents from over-reacting to others and aligning policies with robust action priors. In the presence of worst-case adversaries, our MIR3 significantly surpasses baseline methods in robustness and training efficiency while maintaining cooperative performance in StarCraft II and robot swarm control. When deploying the robot swarm control algorithm in the real world, our method also outperforms the best baseline by 14.29%.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (7)
  1. Simin Li (14 papers)
  2. Ruixiao Xu (3 papers)
  3. Pu Feng (8 papers)
  4. Yaodong Yang (169 papers)
  5. Xianglong Liu (128 papers)
  6. Jingqiao Xiu (3 papers)
  7. Yuwei Zheng (5 papers)
Citations (2)

Summary

We haven't generated a summary for this paper yet.