
Learning and Sustaining Shared Normative Systems via Bayesian Rule Induction in Markov Games (2402.13399v2)

Published 20 Feb 2024 in cs.AI

Abstract: A universal feature of human societies is the adoption of systems of rules and norms in the service of cooperative ends. How can we build learning agents that do the same, so that they may flexibly cooperate with the human institutions they are embedded in? We hypothesize that agents can achieve this by assuming there exists a shared set of norms that most others comply with while pursuing their individual desires, even if they do not know the exact content of those norms. By assuming shared norms, a newly introduced agent can infer the norms of an existing population from observations of compliance and violation. Furthermore, groups of agents can converge to a shared set of norms, even if they initially diverge in their beliefs about what the norms are. This in turn enables the stability of the normative system: since agents can bootstrap common knowledge of the norms, this leads the norms to be widely adhered to, enabling new entrants to rapidly learn those norms. We formalize this framework in the context of Markov games and demonstrate its operation in a multi-agent environment via approximately Bayesian rule induction of obligative and prohibitive norms. Using our approach, agents are able to rapidly learn and sustain a variety of cooperative institutions, including resource management norms and compensation for pro-social labor, promoting collective welfare while still allowing agents to act in their own interests.

Exploring the Bayesian Landscape of Norm Learning in Multi-Agent Systems

Norm Learning Through Bayesian Inference

A central goal in multi-agent systems research is to build agents that can integrate into, and cooperate within, human social structures. The paper "Learning and Sustaining Shared Normative Systems via Bayesian Rule Induction in Markov Games" addresses this goal by showing how agents can infer, comply with, and sustain shared normative systems using approximately Bayesian inference in Markov games. Its framework rests on the assumption that a shared set of norms exists and that most agents comply with it. Under that assumption, a newly introduced agent can rapidly infer the norms of an existing population from observations of compliance and violation.

Formal Framework

The paper introduces Norm-Augmented Markov Games (NMGs), which extend standard Markov games with social norms, represented as functions that classify actions as compliant or non-compliant. Agents in an NMG maintain beliefs about which norms are shared and update those beliefs by Bayesian inference as they observe other agents act. In effect, agents learn norms by watching and interpreting the behavior of others, and then adapt their own strategies to comply with the norms they believe to be shared.
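
The belief update described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a single candidate prohibition, treats that norm as either in force or not, and uses made-up violation probabilities. The names `CandidateNorm`, `update_belief`, the state labels `stock_low` and `stock_high`, and the actions `harvest` and `wait` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# An observation is a (state, action) pair; both labels are hypothetical.
Observation = Tuple[str, str]


@dataclass(frozen=True)
class CandidateNorm:
    """A candidate rule that classifies actions as compliant or not."""
    name: str
    applies: Callable[[str], bool]         # is the norm relevant in this state?
    violates: Callable[[str, str], bool]   # does the action violate it?


def update_belief(
    prior: float,
    norm: CandidateNorm,
    obs: Observation,
    p_violate_if_norm: float = 0.05,    # assumed: norms are mostly complied with
    p_violate_if_no_norm: float = 0.5,  # assumed: base rate without the norm
) -> float:
    """Return P(norm in force | obs) by Bayes' rule for one observed action."""
    state, action = obs
    if not norm.applies(state):
        # The observation says nothing about this norm; belief is unchanged.
        return prior
    violated = norm.violates(state, action)
    like_norm = p_violate_if_norm if violated else 1.0 - p_violate_if_norm
    like_none = p_violate_if_no_norm if violated else 1.0 - p_violate_if_no_norm
    evidence = like_norm * prior + like_none * (1.0 - prior)
    return like_norm * prior / evidence


# Hypothetical prohibition: do not harvest while the shared stock is low.
no_overharvest = CandidateNorm(
    name="do not harvest when stock is low",
    applies=lambda state: state == "stock_low",
    violates=lambda state, action: action == "harvest",
)

belief = 0.5
for obs in [("stock_low", "wait"), ("stock_high", "harvest"), ("stock_low", "wait")]:
    belief = update_belief(belief, no_overharvest, obs)
    print(f"{obs}: P(norm in force) = {belief:.3f}")
```

Each compliant action in a relevant state raises the belief that the norm is in force, a violation lowers it sharply, and actions in states where the norm does not apply leave it unchanged. A fuller model would track beliefs over many candidate rules at once, including obligations as well as prohibitions.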

Empirical Evaluations

The authors ran simulations to study several aspects of norm learning and social coordination among agents. Key results include:

  • Passive Norm Learning: agents rapidly learned the norms practiced by experienced agents through observation alone.
  • Norm-Enabled Social Outcomes: compliance with certain norms substantially improved collective welfare and environmental sustainability, showing the role of shared norms in promoting cooperative behavior.
  • Intergenerational Norm Transmission: norms could be maintained across successive generations of agents, suggesting a way for common knowledge of norms to persist as the agent population changes.
  • Norm Emergence and Convergence: agents could bootstrap a shared set of norms from scratch, converging over time through mutual observation and individual exploratory actions.

Theoretical and Practical Implications

The work clarifies the mechanisms by which learning agents can infer and conform to communal norms, which in turn supports their integration into human societies. It offers a model of how norms are learned and sustained in a decentralized way, extending the theoretical groundwork for AI systems designed for human-agent cooperation. Practical applications include domains where multi-agent systems interact closely with human environments and must comply with shared societal rules and standards.

Future Directions

The paper opens several avenues for future research: how sanctions influence norm learning and sustenance, how model-free and model-based learning interact in understanding and generating norm-compliant behavior, and how to develop agents capable of normative reasoning and adaptation. Using large language models (LLMs) for norm representation and reasoning is another open direction, one that would combine symbolic rule-based approaches with language understanding models.

Closing Thoughts

Agents that learn social norms are a step toward more adaptable and socially aware AI. The framework and findings in this paper advance our understanding of how such systems can learn and sustain norms, and they lay groundwork for practical implementation in complex human environments. Building agents that can navigate human social structures remains challenging, but this study indicates a promising direction.

Authors (2)
  1. Ninell Oldenburg (4 papers)
  2. Tan Zhi-Xuan (22 papers)
Citations (1)