QGNN: Value Function Factorisation with Graph Neural Networks (2205.13005v2)
Abstract: In multi-agent reinforcement learning, a global objective is a powerful tool for incentivising cooperation. Unfortunately, training individual agents with a global reward is not sample-efficient, because the global reward does not necessarily correlate with any one agent's actions. This problem can be solved by factorising the global value function into local value functions. Early work in this domain performed factorisation by conditioning local value functions purely on local information. Recently, it has been shown that providing both local information and an encoding of the global state can promote cooperative behaviour. In this paper we propose QGNN, the first value factorisation method to use a graph neural network (GNN) based model. The multi-layer message passing architecture of QGNN provides more representational complexity than models in prior work, allowing it to produce a more effective factorisation. QGNN also introduces a permutation-invariant mixer that matches the performance of other methods with significantly fewer parameters. We evaluate our method against several baselines, including QMIX-Att, GraphMIX, QMIX, VDN, and hybrid architectures. Our experiments include StarCraft, the standard benchmark for credit assignment; Estimate Game, a custom environment that explicitly models inter-agent dependencies; and Coalition Structure Generation, a foundational problem with real-world applications. The results show that QGNN consistently outperforms state-of-the-art value factorisation baselines.
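To make the idea of a permutation-invariant mixer concrete, the following is a minimal NumPy sketch, not the QGNN architecture itself. It assumes a Deep Sets style mixer of the form Q_tot = rho(sum_i phi(Q_i)), with small randomly initialised MLPs standing in for trained networks; the names `mlp`, `init_mlp`, `vdn_mix`, and `invariant_mix` are hypothetical and chosen only for illustration. A VDN-style sum is included for comparison, since it is the simplest factorisation of this kind.

```python
import numpy as np


def init_mlp(rng, n_in, n_hidden, n_out):
    """Randomly initialise a two-layer perceptron (illustrative, untrained)."""
    return (
        rng.normal(scale=0.5, size=(n_in, n_hidden)),
        np.zeros(n_hidden),
        rng.normal(scale=0.5, size=(n_hidden, n_out)),
        np.zeros(n_out),
    )


def mlp(params, x):
    """Apply a two-layer perceptron with a tanh hidden layer."""
    w1, b1, w2, b2 = params
    return np.tanh(x @ w1 + b1) @ w2 + b2


def vdn_mix(local_q):
    """VDN-style factorisation: the global value is the sum of local values."""
    return float(np.sum(local_q))


def invariant_mix(phi, rho, local_q):
    """Assumed Deep Sets style mixer: rho(sum_i phi(q_i)).

    Summing over agents before applying rho makes the output independent
    of agent ordering, and the parameter count does not grow with the
    number of agents.
    """
    per_agent = mlp(phi, local_q.reshape(-1, 1))  # shape (n_agents, d)
    pooled = per_agent.sum(axis=0)                # shape (d,)
    return float(mlp(rho, pooled)[0])


rng = np.random.default_rng(0)
phi = init_mlp(rng, n_in=1, n_hidden=8, n_out=4)
rho = init_mlp(rng, n_in=4, n_hidden=8, n_out=1)

# Hypothetical local Q-values for four agents.
local_q = np.array([0.3, -1.2, 2.0, 0.7])
shuffled = local_q[rng.permutation(local_q.size)]

q_tot = invariant_mix(phi, rho, local_q)
q_tot_shuffled = invariant_mix(phi, rho, shuffled)
print(np.isclose(q_tot, q_tot_shuffled))
```

Reordering the agents leaves the mixed value unchanged up to floating-point rounding, which is the property the abstract refers to. A mixer conditioned on a global state encoding, as in QMIX, would take additional inputs that this sketch omits.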