
Stateful active facilitator: Coordination and Environmental Heterogeneity in Cooperative Multi-Agent Reinforcement Learning (2210.03022v3)

Published 4 Oct 2022 in cs.AI and cs.LG

Abstract: In cooperative multi-agent reinforcement learning, a team of agents works together to achieve a common goal. Different environments or tasks may require varying degrees of coordination among agents in order to achieve the goal in an optimal way. The nature of coordination will depend on the properties of the environment -- its spatial layout, distribution of obstacles, dynamics, etc. We term this variation of properties within an environment as heterogeneity. Existing literature has not sufficiently addressed the fact that different environments may have different levels of heterogeneity. We formalize the notions of coordination level and heterogeneity level of an environment and present HECOGrid, a suite of multi-agent RL environments that facilitates empirical evaluation of different MARL approaches across different levels of coordination and environmental heterogeneity by providing a quantitative control over coordination and heterogeneity levels of the environment. Further, we propose a Centralized Training Decentralized Execution learning approach called Stateful Active Facilitator (SAF) that enables agents to work efficiently in high-coordination and high-heterogeneity environments through a differentiable and shared knowledge source used during training and dynamic selection from a shared pool of policies. We evaluate SAF and compare its performance against baselines IPPO and MAPPO on HECOGrid. Our results show that SAF consistently outperforms the baselines across different tasks and different heterogeneity and coordination levels. We release the code for HECOGrid as well as all our experiments.
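
To make the two components mentioned in the abstract more concrete, below is a minimal, illustrative PyTorch sketch of the SAF idea: agents read from a shared, differentiable knowledge source and dynamically select a policy from a shared pool. All module names, dimensions, and the attention and Gumbel-softmax mechanisms here are assumptions chosen for illustration, not the paper's actual architecture or hyperparameters.

```python
# Hedged sketch of the SAF idea: a shared knowledge source plus a shared
# policy pool with per-step differentiable selection. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedKnowledgeSource(nn.Module):
    """Learnable slots that all agents read from via soft attention (hypothetical design)."""
    def __init__(self, num_slots: int, dim: int):
        super().__init__()
        self.slots = nn.Parameter(torch.randn(num_slots, dim))

    def read(self, query: torch.Tensor) -> torch.Tensor:
        # query: (batch, dim); attend over shared slots and return a read vector
        attn = F.softmax(query @ self.slots.t() / query.shape[-1] ** 0.5, dim=-1)
        return attn @ self.slots  # (batch, dim)

class PolicyPool(nn.Module):
    """Shared pool of policy heads; each agent picks one per step (hypothetical selection rule)."""
    def __init__(self, num_policies: int, dim: int, num_actions: int):
        super().__init__()
        self.heads = nn.ModuleList(nn.Linear(dim, num_actions) for _ in range(num_policies))
        self.selector = nn.Linear(dim, num_policies)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # Differentiable (straight-through Gumbel-softmax) choice of one head from the pool
        weights = F.gumbel_softmax(self.selector(h), tau=1.0, hard=True)   # (batch, num_policies)
        logits = torch.stack([head(h) for head in self.heads], dim=1)      # (batch, num_policies, A)
        return (weights.unsqueeze(-1) * logits).sum(dim=1)                 # (batch, A)

class SAFAgent(nn.Module):
    """One agent: local encoder plus references to the shared knowledge source and policy pool."""
    def __init__(self, obs_dim: int, dim: int, num_actions: int,
                 knowledge: SharedKnowledgeSource, pool: PolicyPool):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, dim)
        self.knowledge = knowledge  # shared across agents during centralized training
        self.pool = pool            # shared pool of policies

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.encoder(obs))
        h = h + self.knowledge.read(h)  # augment local features with shared knowledge
        return self.pool(h)             # action logits

# Usage: two agents share one knowledge source and one policy pool.
knowledge, pool = SharedKnowledgeSource(8, 32), PolicyPool(4, 32, 5)
agents = [SAFAgent(10, 32, 5, knowledge, pool) for _ in range(2)]
action_logits = agents[0](torch.randn(1, 10))  # (1, 5)
```

In this sketch the knowledge source and policy pool are simply shared `nn.Module` instances, which loosely mirrors the centralized-training aspect; how SAF restricts their use at execution time is not specified here and would follow the paper's own protocol.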

Authors (9)
  1. Dianbo Liu (59 papers)
  2. Vedant Shah (14 papers)
  3. Oussama Boussif (6 papers)
  4. Cristian Meo (13 papers)
  5. Anirudh Goyal (93 papers)
  6. Tianmin Shu (44 papers)
  7. Michael Mozer (17 papers)
  8. Nicolas Heess (139 papers)
  9. Yoshua Bengio (601 papers)
Citations (7)
