
Multi-Agent Cooperation via Unsupervised Learning of Joint Intentions (2307.02200v1)

Published 5 Jul 2023 in cs.MA

Abstract: Cooperative multi-agent reinforcement learning (MARL) is widely used to address complex coordination tasks. While value decomposition methods are popular in MARL, they struggle on tasks with non-monotonic returns, which restricts their general applicability. Our work highlights the significance of joint intentions in cooperation, which can overcome non-monotonicity and increase the interpretability of the learning process. To this end, we present a novel MARL method that leverages learnable joint intentions. Our method employs a hierarchical framework consisting of a joint intention policy and a behavior policy to formulate the optimal cooperative policy. The joint intentions are learned autonomously in a latent space through unsupervised learning, which makes the method adaptable to different agent configurations. Our results demonstrate significant performance improvements on both the StarCraft micromanagement benchmark and challenging MAgent domains, showcasing the effectiveness of our method in learning meaningful joint intentions.
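The hierarchical structure described in the abstract can be illustrated with a minimal sketch: a high-level policy samples a shared latent "joint intention" for the whole team, and each agent's low-level behavior policy then acts conditioned on its own observation plus that shared latent. The class names, dimensions, and the tabular/linear parameterizations below are illustrative assumptions, not the paper's actual architecture (which learns the intention space with unsupervised learning).

```python
import numpy as np

rng = np.random.default_rng(0)

class JointIntentionPolicy:
    """High-level policy: samples one shared latent 'joint intention' for the team.
    In the paper this latent space is learned without supervision; here it is
    just a fixed discrete set with learnable logits, for illustration."""
    def __init__(self, n_intentions):
        self.n_intentions = n_intentions
        self.logits = np.zeros(n_intentions)

    def sample(self):
        p = np.exp(self.logits) / np.exp(self.logits).sum()
        return int(rng.choice(self.n_intentions, p=p))

class BehaviorPolicy:
    """Low-level policy: each agent acts on its observation, conditioned on the
    shared intention (one linear head per intention, an illustrative choice)."""
    def __init__(self, n_actions, obs_dim, n_intentions):
        self.W = rng.normal(size=(n_intentions, obs_dim, n_actions)) * 0.1

    def act(self, obs, intention):
        return int(np.argmax(obs @ self.W[intention]))

def team_step(obs_batch, intention_policy, behavior_policy):
    """One hierarchical decision: first sample a joint intention shared by all
    agents, then let each agent pick its own action under that intention."""
    z = intention_policy.sample()
    actions = [behavior_policy.act(o, z) for o in obs_batch]
    return z, actions

# Toy rollout: 3 agents, 4-dim observations, 5 candidate intentions, 3 actions.
ip = JointIntentionPolicy(n_intentions=5)
bp = BehaviorPolicy(n_actions=3, obs_dim=4, n_intentions=5)
obs = rng.normal(size=(3, 4))
z, actions = team_step(obs, ip, bp)
```

Because the intention is sampled once per team rather than per agent, all agents coordinate around the same latent, which is the structural property the method relies on to escape non-monotonic value decompositions.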

Authors (5)
  1. Shanqi Liu
  2. Weiwei Liu
  3. Wenzhou Chen
  4. Guanzhong Tian
  5. Yong Liu