Macro-Action-Based Deep Multi-Agent Reinforcement Learning (2004.08646v2)

Published 18 Apr 2020 in cs.LG, cs.AI, and cs.RO

Abstract: In real-world multi-robot systems, performing high-quality, collaborative behaviors requires robots to asynchronously reason about high-level action selection at varying time durations. Macro-Action Decentralized Partially Observable Markov Decision Processes (MacDec-POMDPs) provide a general framework for asynchronous decision making under uncertainty in fully cooperative multi-agent tasks. However, multi-agent deep reinforcement learning methods have only been developed for (synchronous) primitive-action problems. This paper proposes two Deep Q-Network (DQN) based methods for learning decentralized and centralized macro-action-value functions with novel macro-action trajectory replay buffers introduced for each case. Evaluations on benchmark problems and a larger domain demonstrate the advantage of learning with macro-actions over primitive actions and the scalability of our approaches.

Authors (3)

Yuchen Xiao (22 papers)
Joshua Hoffman (2 papers)
Christopher Amato (57 papers)

Citations (27)
