${\rm E}(3)$-Equivariant Actor-Critic Methods for Cooperative Multi-Agent Reinforcement Learning (2308.11842v3)
Abstract: Identification and analysis of symmetrical patterns in the natural world have led to significant discoveries across various scientific fields, such as the formulation of gravitational laws in physics and advancements in the study of chemical structures. In this paper, we focus on exploiting the Euclidean symmetries that are inherent in certain cooperative multi-agent reinforcement learning (MARL) problems and that arise in many applications. We begin by formally characterizing a subclass of Markov games with a general notion of symmetries that admits the existence of symmetric optimal values and policies. Motivated by these properties, we design neural network architectures with symmetry constraints embedded as an inductive bias for multi-agent actor-critic methods. This inductive bias results in superior performance on various cooperative MARL benchmarks and in strong generalization capabilities, such as zero-shot learning and transfer learning in unseen scenarios with repeated symmetric patterns. The code is available at: https://github.com/dchen48/E3AC.
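To make the symmetry constraint concrete, here is a minimal NumPy sketch of a map from agent positions to per-agent action vectors that is equivariant to rotations and reflections and invariant to translations. It is an illustration only, not the architecture from the paper or its repository: the function names `equivariant_actions` and `random_orthogonal`, the `weight` parameter, and the `tanh` gate are all assumptions made for this example. The idea is that pairwise distances are unchanged by any Euclidean transformation, so scalars computed from them can safely weight the relative position vectors, which rotate together with the input.

```python
import numpy as np


def equivariant_actions(positions, weight=0.5):
    """Toy policy head (hypothetical): one 3D action vector per agent.

    positions: array of shape (n_agents, 3).
    Each action is a sum of relative position vectors, gated by a scalar
    that depends only on the pairwise squared distance.
    """
    diff = positions[:, None, :] - positions[None, :, :]   # (n, n, 3): x_i - x_j
    dist2 = np.sum(diff ** 2, axis=-1, keepdims=True)      # (n, n, 1): E(3)-invariant
    gate = np.tanh(weight * dist2)                         # invariant scalar per pair
    return np.sum(gate * diff, axis=1)                     # (n, 3): rotates with the input


def random_orthogonal(rng):
    """Sample a random 3x3 orthogonal matrix (rotation or reflection) via QR."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    return q * np.sign(np.diag(r))                         # flip column signs; q stays orthogonal


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))      # four agents in 3D
R = random_orthogonal(rng)       # element of O(3)
t = rng.normal(size=3)           # translation

transformed_then_acted = equivariant_actions(x @ R.T + t)
acted_then_transformed = equivariant_actions(x) @ R.T
print(np.allclose(transformed_then_acted, acted_then_transformed))
```

The check compares acting on the transformed state with transforming the action of the original state. The translation `t` drops out because only position differences enter the computation, which is the appropriate behavior when actions are directions (for example, velocities) rather than points.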