
Emergent Dominance Hierarchies in Reinforcement Learning Agents (2401.12258v7)

Published 21 Jan 2024 in cs.MA, cs.AI, cs.GT, and cs.LG

Abstract: Modern Reinforcement Learning (RL) algorithms are able to outperform humans in a wide variety of tasks. Multi-agent reinforcement learning (MARL) settings present additional challenges, and successful cooperation in mixed-motive groups of agents depends on a delicate balancing act between individual and group objectives. Social conventions and norms, often inspired by human institutions, are used as tools for striking this balance. In this paper, we examine a fundamental, well-studied social convention that underlies cooperation in both animal and human societies: dominance hierarchies. We adapt the ethological theory of dominance hierarchies to artificial agents, borrowing the established terminology and definitions with as few amendments as possible. We demonstrate that populations of RL agents, operating without explicit programming or intrinsic rewards, can invent, learn, enforce, and transmit a dominance hierarchy to new populations. The dominance hierarchies that emerge have a similar structure to those studied in chickens, mice, fish, and other species.

