Social Environment Design (2402.14090v3)

Published 21 Feb 2024 in cs.AI, econ.GN, q-fin.EC, and stat.ML

Abstract: AI holds promise as a technology that can be used to improve government and economic policy-making. This paper proposes a new research agenda towards this end by introducing Social Environment Design, a general framework for the use of AI for automated policy-making that connects with the Reinforcement Learning, EconCS, and Computational Social Choice communities. The framework seeks to capture general economic environments, includes voting on policy objectives, and gives a direction for the systematic analysis of government and economic policy through AI simulation. We highlight key open problems for future research in AI-based policy-making. By solving these challenges, we hope to achieve various social welfare objectives, thereby promoting more ethical and responsible decision-making.
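
The abstract sketches a two-level structure: participants vote on a policy objective, a principal designs the environment to optimize the elected objective, and agents respond within the resulting economy. Below is a minimal, purely illustrative sketch of that loop, not the paper's formalism: the toy flat-tax economy, the two candidate welfare objectives, the majority-vote rule, and every name in the code are our own assumptions.

```python
# Illustrative sketch of a Social-Environment-Design-style loop.
# The environment (flat tax with lump-sum redistribution), the candidate
# objectives, and the voting rule are hypothetical stand-ins.
import numpy as np

rng = np.random.default_rng(0)
N_AGENTS = 11
skills = rng.uniform(0.5, 2.0, size=N_AGENTS)  # heterogeneous productivity

def simulate(tax_rate):
    """Agents best-respond to a flat tax; revenue is redistributed equally."""
    labor = (1.0 - tax_rate) * skills           # l* maximizing (1-t)*w*l - l^2/2
    income = skills * labor
    revenue = tax_rate * income.sum()
    return (1 - tax_rate) * income + revenue / N_AGENTS - labor**2 / 2

def utilitarian(u):  # total welfare
    return u.sum()

def rawlsian(u):     # welfare of the worst-off agent
    return u.min()

def principal_best_tax(objective, grid=np.linspace(0, 0.99, 100)):
    """Principal searches for the tax rate maximizing the elected objective."""
    return max(grid, key=lambda t: objective(simulate(t)))

# Step 1: vote on the policy objective. Each agent compares its utility
# under the principal's optimal policy for each candidate and majority wins.
candidates = {"utilitarian": utilitarian, "rawlsian": rawlsian}
outcomes = {name: simulate(principal_best_tax(f)) for name, f in candidates.items()}
votes = [max(candidates, key=lambda name: outcomes[name][i]) for i in range(N_AGENTS)]
elected = max(candidates, key=votes.count)

# Step 2: the principal designs the environment (here, one tax parameter)
# to maximize the elected objective; agents best-respond inside simulate().
t_star = principal_best_tax(candidates[elected])
print(f"elected objective: {elected}, chosen tax rate: {t_star:.2f}")
```

In the paper's framing, the inner loop would be a full multi-agent RL environment and the principal's search a policy-optimization problem; the closed-form best response and grid search here only stand in for those components.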

Authors (9)
  1. Edwin Zhang (15 papers)
  2. Sadie Zhao (3 papers)
  3. Tonghan Wang (30 papers)
  4. Safwan Hossain (14 papers)
  5. Henry Gasztowtt (3 papers)
  6. Stephan Zheng (31 papers)
  7. David C. Parkes (81 papers)
  8. Milind Tambe (110 papers)
  9. Yiling Chen (66 papers)
Citations (4)
