Leveraging Partial Symmetry for Multi-Agent Reinforcement Learning (2401.00167v1)
Abstract: Incorporating symmetry as an inductive bias into multi-agent reinforcement learning (MARL) has led to improvements in generalization, data efficiency, and physical consistency. While prior research has succeeded in exploiting a perfect symmetry prior, partial symmetry in the multi-agent domain remains unexplored. To fill this gap, we introduce the partially symmetric Markov game, a new subclass of the Markov game. We then show theoretically that the performance error introduced by utilizing symmetry in MARL is bounded, implying that the symmetry prior can still be useful even when symmetry holds only partially. Motivated by this insight, we propose the Partial Symmetry Exploitation (PSE) framework, which adaptively incorporates the symmetry prior in MARL under different symmetry-breaking conditions. Specifically, by adaptively adjusting how strongly symmetry is exploited, the framework improves the sample efficiency and overall performance of MARL algorithms. Extensive experiments demonstrate that the proposed framework outperforms the baselines. Finally, we implement the framework on a real-world multi-robot testbed to show its superiority.
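The idea of adjusting how strongly symmetry is exploited can be illustrated with a minimal sketch. It is not the PSE framework itself: the rotation group action, the exponential weighting schedule, the `breaking` estimate, and all function names below are assumptions chosen for illustration. The sketch penalizes a policy for violating rotation equivariance and shrinks that penalty as the measured symmetry breaking grows.

```python
import numpy as np


def rotation(theta):
    """2D rotation matrix, used here as an assumed symmetry group element."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])


def equivariance_error(policy, obs, g):
    """Mean squared gap between policy(g(obs)) and g(policy(obs)).

    obs has shape (batch, 2); policy maps (batch, 2) -> (batch, 2).
    """
    act_on_rotated_obs = policy(obs @ g.T)
    rotated_action = policy(obs) @ g.T
    gap = act_on_rotated_obs - rotated_action
    return float(np.mean(np.sum(gap ** 2, axis=1)))


def symmetry_weight(breaking, temperature=1.0):
    """Hypothetical schedule: weight in (0, 1] that decays as symmetry breaking grows."""
    return float(np.exp(-breaking / temperature))


def total_loss(task_loss, policy, obs, g, breaking):
    """Task loss plus an adaptively weighted symmetry-consistency penalty."""
    return task_loss + symmetry_weight(breaking) * equivariance_error(policy, obs, g)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    obs = rng.normal(size=(8, 2))
    g = rotation(np.pi / 2)

    def asymmetric(x):
        # Linear policy that ignores the second coordinate, so it is not equivariant.
        return x @ np.diag([1.0, 0.0]).T

    for breaking in (0.0, 1.0, 5.0):
        print(breaking, total_loss(0.0, asymmetric, obs, g, breaking))
```

With `breaking = 0` the penalty is applied at full strength, which corresponds to assuming perfect symmetry; larger values progressively fall back to the plain task loss. How the degree of symmetry breaking would be estimated in practice is left open here.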