Cooperation and Control in Delegation Games (2402.15821v2)
Abstract: Many settings of interest involving humans and machines -- from virtual personal assistants to autonomous vehicles -- can naturally be modelled as principals (humans) delegating to agents (machines), which then interact with each other on their principals' behalf. We refer to these multi-principal, multi-agent scenarios as delegation games. In such games, there are two important failure modes: problems of control (where an agent fails to act in line with their principal's preferences) and problems of cooperation (where the agents fail to work well together). In this paper we formalise and analyse these problems, further breaking them down into issues of alignment (do the players have similar preferences?) and capabilities (how competent are the players at satisfying those preferences?). We show -- theoretically and empirically -- how these measures determine the principals' welfare, how they can be estimated using limited observations, and thus how they might be used to help us design more aligned and cooperative AI systems.
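The two failure modes named in the abstract can be illustrated with a toy two-principal, two-agent matrix game. The sketch below is not the paper's formalism or its alignment and capability measures: the payoff tables, the function names `pure_nash` and `welfare`, the choice of pure-strategy Nash equilibrium as the agents' solution concept, and the use of summed principal utility as welfare are all assumptions made for illustration.

```python
from itertools import product

def pure_nash(u1, u2):
    """Return all pure-strategy Nash equilibria (row, col) of a bimatrix game.

    u1[r][c] is the row player's utility, u2[r][c] the column player's.
    """
    rows, cols = len(u1), len(u1[0])
    equilibria = []
    for r, c in product(range(rows), range(cols)):
        row_best = all(u1[r][c] >= u1[alt][c] for alt in range(rows))
        col_best = all(u2[r][c] >= u2[r][alt] for alt in range(cols))
        if row_best and col_best:
            equilibria.append((r, c))
    return equilibria

def welfare(outcome, p1, p2):
    """Sum of the two principals' utilities at a joint action (assumed welfare measure)."""
    r, c = outcome
    return p1[r][c] + p2[r][c]

def best_welfare(p1, p2):
    """Highest principal welfare over all joint actions."""
    cells = product(range(len(p1)), range(len(p1[0])))
    return max(welfare(cell, p1, p2) for cell in cells)

# Case 1: cooperation failure under perfect control.
# Principals have Prisoner's Dilemma utilities (action 0 = cooperate, 1 = defect)
# and each agent shares its principal's utilities exactly.
pd_1 = [[3, 0], [5, 1]]
pd_2 = [[3, 5], [0, 1]]
coop_equilibria = pure_nash(pd_1, pd_2)            # agents play the principals' game
coop_welfare = welfare(coop_equilibria[0], pd_1, pd_2)
coop_optimum = best_welfare(pd_1, pd_2)

# Case 2: control failure.
# Principals share a common-interest game, but agent 1 is misaligned and
# simply prefers action 1; agent 2 shares its principal's utilities.
ci = [[2, 0], [0, 1]]
agent_1 = [[0, 0], [1, 1]]
agent_2 = ci
ctrl_equilibria = pure_nash(agent_1, agent_2)      # agents play their own game
ctrl_welfare = welfare(ctrl_equilibria[0], ci, ci) # judged by the principals' utilities
ctrl_optimum = best_welfare(ci, ci)

print("cooperation failure:", coop_equilibria, coop_welfare, "of", coop_optimum)
print("control failure:", ctrl_equilibria, ctrl_welfare, "of", ctrl_optimum)
```

In the first case the agents are perfectly aligned with their principals, yet the only equilibrium is mutual defection, so welfare falls short of the optimum. In the second case the principals' interests coincide, and the shortfall comes entirely from one agent's misaligned preferences.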
Authors: Oliver Sourbut, Lewis Hammond, Harriet Wood