Reaching Consensus in Cooperative Multi-Agent Reinforcement Learning with Goal Imagination (2403.03172v1)

Published 5 Mar 2024 in cs.AI and cs.LG

Abstract: Reaching consensus is key to multi-agent coordination. To accomplish a cooperative task, agents need to coherently select optimal joint actions to maximize the team reward. However, current cooperative multi-agent reinforcement learning (MARL) methods usually do not explicitly take consensus into consideration, which may cause miscoordination problems. In this paper, we propose a model-based consensus mechanism to explicitly coordinate multiple agents. The proposed Multi-agent Goal Imagination (MAGI) framework guides agents to reach consensus with an imagined common goal. The common goal is an achievable state with high value, obtained by sampling from the distribution of future states. We directly model this distribution with a self-supervised generative model, thus alleviating the "curse of dimensionality" problem induced by the multi-agent multi-step policy rollouts commonly used in model-based methods. We show that such an efficient consensus mechanism can guide all agents to cooperatively reach valuable future states. Results on the Multi-agent Particle Environment and the Google Research Football environment demonstrate the superiority of MAGI in both sample efficiency and performance.
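The mechanism the abstract describes can be sketched roughly as follows: a self-supervised generative model (here assumed to be a conditional VAE over global states) samples candidate future states, and the highest-value sample is taken as the shared goal broadcast to all agents. This is a minimal illustrative sketch, not the authors' implementation; names such as FutureStateCVAE, imagine_goal, and value_fn are hypothetical placeholders.

```python
# Hedged sketch of MAGI-style goal imagination.
# Assumptions (not from the paper's code): a conditional VAE models the
# distribution of future states given the current global state, and a learned
# value function scores candidate goals. Training of both models is omitted.
import torch
import torch.nn as nn

class FutureStateCVAE(nn.Module):
    """Conditional VAE approximating p(s_future | s_now)."""
    def __init__(self, state_dim: int, latent_dim: int = 16, hidden: int = 128):
        super().__init__()
        # Encoder q(z | s_now, s_future); used only during (omitted) training.
        self.encoder = nn.Sequential(
            nn.Linear(state_dim * 2, hidden), nn.ReLU(),
            nn.Linear(hidden, latent_dim * 2),  # mean and log-variance
        )
        # Decoder p(s_future | z, s_now).
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )
        self.latent_dim = latent_dim

    def sample(self, state: torch.Tensor, n: int) -> torch.Tensor:
        """Draw n imagined future states conditioned on the current global state."""
        z = torch.randn(n, self.latent_dim)
        cond = state.expand(n, -1)
        return self.decoder(torch.cat([z, cond], dim=-1))

def imagine_goal(cvae: FutureStateCVAE, value_fn, state: torch.Tensor,
                 n_samples: int = 64) -> torch.Tensor:
    """Sample candidate future states and keep the highest-value one as the common goal."""
    candidates = cvae.sample(state, n_samples)   # (n_samples, state_dim)
    values = value_fn(candidates).squeeze(-1)    # (n_samples,)
    return candidates[values.argmax()]           # shared goal for all agents
```

Sampling from the generative model directly, rather than rolling out joint policies for many steps, is what the abstract credits with avoiding the exponential blow-up of multi-agent multi-step rollouts.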

Authors (9)
  1. Liangzhou Wang (1 paper)
  2. Kaiwen Zhu (6 papers)
  3. Fengming Zhu (8 papers)
  4. Xinghu Yao (2 papers)
  5. Shujie Zhang (6 papers)
  6. Deheng Ye (50 papers)
  7. Haobo Fu (14 papers)
  8. Qiang Fu (159 papers)
  9. Wei Yang (349 papers)