Emergence of cooperation under punishment: A reinforcement learning perspective (2401.16073v1)

Published 29 Jan 2024 in q-bio.PE, cond-mat.dis-nn, nlin.AO, and physics.soc-ph

Abstract: Punishment is a common tactic to sustain cooperation and has been studied extensively for a long time. While most previous game-theoretic work adopts imitation learning, where players copy the strategies of those who are better off, the learning logic in the real world is often much more complex. In this work, we turn to the reinforcement learning paradigm, where individuals make their decisions based upon their past experience and long-term returns. Specifically, we investigate the Prisoner's dilemma game with the Q-learning algorithm, where cooperators probabilistically impose punishment on defectors in their neighborhood. Interestingly, we find that punishment can lead to either continuous or discontinuous cooperation phase transitions, and the nucleation process of cooperation clusters is reminiscent of the liquid-gas transition. The uncovered first-order phase transition indicates that, compared to the continuous scenario, great care needs to be taken when implementing punishment.
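To make the setup concrete, below is a minimal toy sketch of Q-learning agents in a Prisoner's dilemma with probabilistic peer punishment. It is not the paper's model: it uses two agents instead of a spatial neighborhood, and the payoff values, punishment fine and cost, punishment probability, and learning parameters are all illustrative assumptions.

    # Toy sketch (assumed parameters, not the paper's spatial model):
    # two Q-learning agents play a repeated Prisoner's dilemma; a cooperator
    # may, with some probability, punish a defecting opponent at a cost to itself.
    import random

    ACTIONS = ["C", "D"]                     # cooperate / defect
    R, S, T, P = 3.0, 0.0, 5.0, 1.0          # standard PD payoffs (assumed values)
    PUNISH_PROB = 0.5                        # chance a cooperator punishes a defector (assumed)
    FINE, COST = 4.0, 1.0                    # fine on the defector, cost to the punisher (assumed)
    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.05   # Q-learning parameters (assumed)

    def payoffs(a1, a2):
        """Base PD payoffs for the action pair (a1, a2)."""
        table = {("C", "C"): (R, R), ("C", "D"): (S, T),
                 ("D", "C"): (T, S), ("D", "D"): (P, P)}
        return table[(a1, a2)]

    class QAgent:
        def __init__(self):
            # State = opponent's last action; one row of Q-values per state.
            self.q = {s: {a: 0.0 for a in ACTIONS} for s in ACTIONS + ["start"]}

        def act(self, state):
            if random.random() < EPSILON:    # epsilon-greedy exploration
                return random.choice(ACTIONS)
            return max(self.q[state], key=self.q[state].get)

        def update(self, state, action, reward, next_state):
            # Standard Q-learning update toward reward plus discounted best next value.
            best_next = max(self.q[next_state].values())
            td_target = reward + GAMMA * best_next
            self.q[state][action] += ALPHA * (td_target - self.q[state][action])

    def run(rounds=20000):
        a, b = QAgent(), QAgent()
        state_a = state_b = "start"
        coop = 0
        for _ in range(rounds):
            act_a, act_b = a.act(state_a), b.act(state_b)
            pay_a, pay_b = payoffs(act_a, act_b)
            # Probabilistic peer punishment: a cooperator may fine a defecting opponent.
            if act_a == "C" and act_b == "D" and random.random() < PUNISH_PROB:
                pay_a -= COST
                pay_b -= FINE
            if act_b == "C" and act_a == "D" and random.random() < PUNISH_PROB:
                pay_b -= COST
                pay_a -= FINE
            a.update(state_a, act_a, pay_a, act_b)
            b.update(state_b, act_b, pay_b, act_a)
            state_a, state_b = act_b, act_a
            coop += (act_a == "C") + (act_b == "C")
        return coop / (2 * rounds)

    if __name__ == "__main__":
        print(f"cooperation frequency: {run():.3f}")

Sweeping FINE and PUNISH_PROB in such a toy and tracking the cooperation frequency is one way to build intuition for how punishment strength can shift the system between mostly-defecting and mostly-cooperating regimes, though the continuous versus first-order transitions reported in the paper arise in the spatial, neighborhood-based setting.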

