
Emergent Cooperation under Uncertain Incentive Alignment (2401.12646v1)

Published 23 Jan 2024 in cs.MA, cs.AI, and cs.GT

Abstract: Understanding the emergence of cooperation in systems of computational agents is crucial for the development of effective cooperative AI. Interactions among individuals in real-world settings are often sparse and occur within a broad spectrum of incentives, which are often only partially known. In this work, we explore how cooperation can arise among reinforcement learning agents in scenarios characterised by infrequent encounters, where agents face uncertainty about the alignment of their incentives with those of others. To do so, we train the agents under a wide spectrum of environments, ranging from fully competitive to fully cooperative to mixed-motive. Under this type of uncertainty, we study the effects of mechanisms, such as reputation and intrinsic rewards, that have been proposed in the literature to foster cooperation in mixed-motive environments. Our findings show that uncertainty substantially lowers the agents' ability to engage in cooperative behaviour when that would be the best course of action. In this scenario, the use of effective reputation mechanisms and intrinsic rewards boosts the agents' capability to act nearly optimally in cooperative environments, while greatly enhancing cooperation in mixed-motive environments as well.

Authors (4)
  1. Nicole Orzan (2 papers)
  2. Erman Acar (27 papers)
  3. Davide Grossi (29 papers)
  4. Roxana Rădulescu (16 papers)
Citations (3)