Learning Multi-Agent Communication with Contrastive Learning (2307.01403v3)
Abstract: Communication is a powerful tool for coordination in multi-agent RL. But inducing an effective, common language is a difficult challenge, particularly in the decentralized setting. In this work, we introduce an alternative perspective where communicative messages sent between agents are considered as different incomplete views of the environment state. By examining the relationship between messages sent and received, we propose to learn to communicate using contrastive learning to maximize the mutual information between messages of a given trajectory. In communication-essential environments, our method outperforms previous work in both performance and learning speed. Using qualitative metrics and representation probing, we show that our method induces more symmetric communication and captures global state information from the environment. Overall, we show the power of contrastive learning and the importance of leveraging messages as encodings for effective communication.
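The idea of treating messages from the same trajectory as different views of one state can be illustrated with a small contrastive objective. The sketch below is not the paper's implementation: it assumes a supervised-contrastive (InfoNCE-style) loss with cosine similarity, where messages that share a trajectory ID are positives and all other messages in the batch are negatives. The names `contrastive_message_loss`, `trajectory_ids`, and `temperature` are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def contrastive_message_loss(messages, trajectory_ids, temperature=0.1):
    """InfoNCE-style loss over a batch of message vectors (illustrative assumption).

    Positives for an anchor are the other messages with the same trajectory ID;
    every other message in the batch acts as a negative.
    """
    n = len(messages)
    total, anchors = 0.0, 0
    for i in range(n):
        # Temperature-scaled similarities between the anchor and all other messages.
        logits = {
            j: cosine(messages[i], messages[j]) / temperature
            for j in range(n)
            if j != i
        }
        positives = [j for j in logits if trajectory_ids[j] == trajectory_ids[i]]
        if not positives:
            continue  # an anchor without a positive contributes nothing
        # Numerically stable log-sum-exp over all candidates.
        max_logit = max(logits.values())
        log_denom = max_logit + math.log(
            sum(math.exp(value - max_logit) for value in logits.values())
        )
        # Average negative log-probability of picking a positive.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy batch: two trajectories, two agents' messages each.
messages = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = contrastive_message_loss(messages, [0, 0, 1, 1])
misaligned = contrastive_message_loss(messages, [0, 1, 0, 1])
print(aligned < misaligned)
```

Minimizing a loss of this form pulls together messages from the same trajectory and pushes apart messages from different ones, which is one common way to lower-bound the mutual information between views. In the toy batch, the loss is smaller when the trajectory labels match the messages that actually resemble each other.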
Authors: Yat Long Lo, Biswa Sengupta, Jakob Foerster, Michael Noukhovitch