Multiagent Soft Q-Learning (1804.09817v1)
Published 25 Apr 2018 in cs.AI
Abstract: Policy gradient methods are often applied to reinforcement learning in continuous multiagent games. These methods perform local search in the joint-action space, and as we show, they are susceptible to a game-theoretic pathology known as relative overgeneralization. To resolve this issue, we propose Multiagent Soft Q-learning, which can be seen as the analogue of applying Q-learning to continuous controls. We compare our method to MADDPG, a state-of-the-art approach, and show that our method achieves better coordination in multiagent cooperative tasks, converging to better local optima in the joint action space.
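
The core idea the abstract alludes to is the soft (maximum-entropy) Bellman backup: instead of following a local policy gradient in the joint-action space, the Q-target is computed from a soft value that integrates over joint actions, which is what helps escape the shallow optima associated with relative overgeneralization. The snippet below is a minimal illustrative sketch of that backup, not the authors' implementation; the toy `q_function`, the temperature `alpha`, the uniform action proposal, and all other names are assumptions introduced for illustration.

```python
# Minimal sketch of the soft Bellman backup behind soft Q-learning, written
# over a JOINT action space as in the multiagent setting. Everything here
# (toy Q-function, temperature, sample counts) is an illustrative assumption.
import numpy as np

rng = np.random.default_rng(0)

alpha = 0.1            # temperature of the energy-based (softmax) policy
gamma = 0.99           # discount factor
n_action_samples = 64  # Monte Carlo samples used to estimate the soft value
action_dim = 2         # joint action: one continuous control per agent

def q_function(state, joint_actions):
    """Toy stand-in for a learned Q-network over the JOINT action.

    Rewards are highest when both agents pick actions near +0.5, so the
    optimum requires coordination in the joint-action space.
    """
    return -np.sum((joint_actions - 0.5) ** 2, axis=-1) + 0.1 * state

def soft_value(state):
    """Estimate V_soft(s) = alpha * log E_{a ~ Uniform}[exp(Q(s, a) / alpha)].

    Uses a uniform proposal over joint actions (omitting the additive
    constant from the proposal's volume) and a log-sum-exp for stability.
    """
    actions = rng.uniform(-1.0, 1.0, size=(n_action_samples, action_dim))
    q_vals = q_function(state, actions)
    q_max = q_vals.max()
    return alpha * (np.log(np.mean(np.exp((q_vals - q_max) / alpha))) + q_max / alpha)

def soft_bellman_target(reward, next_state, done):
    """Regression target for Q(s, a): r + gamma * V_soft(s')."""
    return reward + gamma * (0.0 if done else soft_value(next_state))

# Example transition (s, a, r, s'): the value the Q-network would be trained toward.
target = soft_bellman_target(reward=1.0, next_state=0.3, done=False)
print(f"soft Bellman target: {target:.4f}")
```

Because the target averages exponentiated Q-values over sampled joint actions rather than evaluating the gradient at the current joint policy, high-value coordinated regions still contribute to the backup even when the agents' current actions are miscoordinated, which is the intuition for why this style of update resists relative overgeneralization better than DDPG-style local search.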
- Ermo Wei (2 papers)
- Drew Wicke (2 papers)
- David Freelan (1 paper)
- Sean Luke (3 papers)