Deconstructing Cooperation and Ostracism via Multi-Agent Reinforcement Learning (2310.04623v1)
Abstract: Cooperation is challenging in biological systems, human societies, and multi-agent systems in general. While a group can benefit when everyone cooperates, it is tempting for each agent to act selfishly instead. Prior human studies show that people can overcome such social dilemmas while choosing interaction partners, i.e., strategic network rewiring. However, little is known about how agents, including humans, can learn about cooperation from strategic rewiring and vice versa. Here, we perform multi-agent reinforcement learning simulations in which two agents play the Prisoner's Dilemma game iteratively. Each agent has two policies: one controls whether to cooperate or defect; the other controls whether to rewire connections with another agent. This setting enables us to disentangle complex causal dynamics between cooperation and network rewiring. We find that network rewiring facilitates mutual cooperation even when one agent always offers cooperation, which is vulnerable to free-riding. We then confirm that the network-rewiring effect is exerted through agents' learning of ostracism, that is, connecting to cooperators and disconnecting from defectors. However, we also find that ostracism alone is not sufficient to make cooperation emerge. Instead, ostracism emerges from the learning of cooperation, and existing cooperation is subsequently reinforced due to the presence of ostracism. Our findings provide insights into the conditions and mechanisms necessary for the emergence of cooperation with network rewiring.
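The setting the abstract describes — each agent holding one policy over cooperate/defect and a second policy over keeping or cutting its connection — can be sketched with a toy tabular learner. Everything below is illustrative and not from the paper: the class and function names, the epsilon-greedy rule, and the standard Prisoner's Dilemma payoffs (T=5, R=3, P=1, S=0) are our own choices, and the paper's agents use deep reinforcement learning rather than this bandit-style update.

```python
import random

# Prisoner's Dilemma payoffs for the row player, indexed by
# (own action, other's action); 0 = cooperate, 1 = defect.
# Satisfies T > R > P > S with T=5, R=3, P=1, S=0.
PAYOFF = {(0, 0): 3, (0, 1): 0, (1, 0): 5, (1, 1): 1}

class Agent:
    """Two independent epsilon-greedy policies per agent:
    q_act  over {cooperate, defect},
    q_link over {keep link, cut link}."""
    def __init__(self, lr=0.1, eps=0.1):
        self.q_act = [0.0, 0.0]
        self.q_link = [0.0, 0.0]
        self.lr, self.eps = lr, eps

    def choose(self, q):
        # Epsilon-greedy over a 2-entry Q-table.
        if random.random() < self.eps:
            return random.randrange(2)
        return 0 if q[0] >= q[1] else 1

    def update(self, q, a, reward):
        # Simple bandit-style update toward the realized payoff.
        q[a] += self.lr * (reward - q[a])

def play_round(a, b):
    """Both agents first decide on the link; the game is played
    only if neither side cuts. Both policies are updated from the
    realized payoff, so ostracism (cutting a defector) and
    cooperation are learned jointly. Returns whether they played."""
    la, lb = a.choose(a.q_link), b.choose(b.q_link)
    connected = (la == 0 and lb == 0)
    if connected:
        xa, xb = a.choose(a.q_act), b.choose(b.q_act)
        ra, rb = PAYOFF[(xa, xb)], PAYOFF[(xb, xa)]
        a.update(a.q_act, xa, ra)
        b.update(b.q_act, xb, rb)
    else:
        ra = rb = 0  # no interaction, no payoff this round
    a.update(a.q_link, la, ra)
    b.update(b.q_link, lb, rb)
    return connected

random.seed(0)
a, b = Agent(), Agent()
for _ in range(5000):
    play_round(a, b)
```

Separating the two Q-tables is what lets this kind of simulation disentangle the causal question in the abstract: one can freeze `q_act` (e.g., an agent that always cooperates) and watch whether the rewiring policy alone sustains interaction, or vice versa.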
Authors: Atsushi Ueshima, Shayegan Omidshafiei, Hirokazu Shirado