
Coordinated Exploration via Intrinsic Rewards for Multi-Agent Reinforcement Learning (1905.12127v3)

Published 28 May 2019 in cs.LG, cs.AI, cs.MA, and stat.ML

Abstract: Solving tasks with sparse rewards is one of the most important challenges in reinforcement learning. In the single-agent setting, this challenge is addressed by introducing intrinsic rewards that motivate agents to explore unseen regions of their state spaces; however, applying these techniques naively to the multi-agent setting results in agents exploring independently, without any coordination among themselves. Exploration in cooperative multi-agent settings can be accelerated and improved if agents coordinate their exploration. In this paper we introduce a framework for designing intrinsic rewards which consider what other agents have explored such that the agents can coordinate. Then, we develop an approach for learning how to dynamically select between several exploration modalities to maximize extrinsic rewards. Concretely, we formulate the approach as a hierarchical policy where a high-level controller selects among sets of policies trained on diverse intrinsic rewards and the low-level controllers learn the action policies of all agents under these specific rewards. We demonstrate the effectiveness of the proposed approach in cooperative domains with sparse rewards where state-of-the-art methods fail and challenging multi-stage tasks that necessitate changing modes of coordination.
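The two ideas in the abstract, intrinsic rewards that account for what other agents have explored and a high-level controller that picks among exploration modes, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the abstract: the count-based novelty measure, the `independent_reward` and `minimum_reward` functions, and the `ModeSelector` class. In particular, the selector below is a simple epsilon-greedy bandit standing in for the learned high-level policy the abstract describes.

```python
import random
from collections import Counter


class NoveltyTracker:
    """Hypothetical count-based novelty for one agent."""

    def __init__(self):
        self.counts = Counter()

    def visit(self, state):
        self.counts[state] += 1

    def novelty(self, state):
        # Novelty shrinks as this agent's visit count for the state grows.
        return 1.0 / (1 + self.counts[state]) ** 0.5


def independent_reward(trackers, agent, state):
    # Uncoordinated baseline: only the agent's own visits matter.
    return trackers[agent].novelty(state)


def minimum_reward(trackers, agent, state):
    # Coordinated variant (an assumption): a state is only as novel as it is
    # to the agent that has seen it most, so nobody is paid to revisit it.
    return min(t.novelty(state) for t in trackers)


class ModeSelector:
    """Epsilon-greedy stand-in for a high-level controller over reward modes."""

    def __init__(self, modes, epsilon=0.1, seed=0):
        self.modes = list(modes)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.value = {m: 0.0 for m in self.modes}
        self.n = {m: 0 for m in self.modes}

    def select(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.modes)
        return max(self.modes, key=lambda m: self.value[m])

    def update(self, mode, extrinsic_return):
        # Running mean of the extrinsic return observed under each mode.
        self.n[mode] += 1
        self.value[mode] += (extrinsic_return - self.value[mode]) / self.n[mode]


trackers = [NoveltyTracker(), NoveltyTracker()]
for _ in range(3):
    trackers[1].visit("s")  # only agent 1 has explored state "s"

r_independent = independent_reward(trackers, 0, "s")
r_minimum = minimum_reward(trackers, 0, "s")

selector = ModeSelector(["independent", "minimum"], epsilon=0.0)
selector.update("independent", 0.0)
selector.update("minimum", 1.0)
chosen = selector.select()
```

In this toy run, agent 0 still finds state "s" fully novel under the independent reward, while the coordinated reward discounts it because agent 1 has already been there. The selector then favors whichever mode has produced the higher extrinsic return so far, which mirrors, in a much simpler form, the idea of choosing exploration modalities to maximize extrinsic reward.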

Authors (2)
  1. Shariq Iqbal (14 papers)
  2. Fei Sha (88 papers)
Citations (48)