Mediated Multi-Agent Reinforcement Learning (2306.08419v1)

Published 14 Jun 2023 in cs.MA, cs.GT, and cs.LG

Abstract: The majority of Multi-Agent Reinforcement Learning (MARL) literature equates the cooperation of self-interested agents in mixed environments to the problem of social welfare maximization, allowing agents to arbitrarily share rewards and private information. This results in agents that forgo their individual goals in favour of social good, which can potentially be exploited by selfish defectors. We argue that cooperation also requires agents' identities and boundaries to be respected by making sure that the emergent behaviour is an equilibrium, i.e., a convention that no agent can deviate from and receive a higher individual payoff. Inspired by advances in mechanism design, we propose to solve the problem of cooperation, defined as finding a socially beneficial equilibrium, by using mediators. A mediator is a benevolent entity that may act on behalf of agents, but only for the agents that agree to it. We show how a mediator can be trained alongside agents with policy gradient to maximize social welfare subject to constraints that encourage agents to cooperate through the mediator. Our experiments in matrix and iterative games highlight the potential power of applying mediators in MARL.
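
To make the idea of a mediator concrete, the following is a minimal sketch on a one-shot Prisoner's Dilemma. It is not the paper's method: the mediator here follows a fixed, hand-written rule instead of being trained with policy gradient, and the payoff values, the extra action "M" (commit to the mediator), and all function names are assumptions chosen for illustration. The sketch only checks which pure action profiles are equilibria with and without the mediator.

```python
from itertools import product

# Hypothetical action labels: cooperate, defect, commit to the mediator.
C, D, M = "C", "D", "M"

# Assumed Prisoner's Dilemma payoffs (row agent, column agent).
PAYOFF = {
    (C, C): (3, 3),
    (C, D): (0, 5),
    (D, C): (5, 0),
    (D, D): (1, 1),
}


def mediator_action(num_committed):
    """Hand-written mediator rule (an assumption, not a learned policy):
    cooperate only when both agents have committed, otherwise defect
    on behalf of the single committed agent."""
    return C if num_committed == 2 else D


def mediated_payoff(a0, a1):
    """Resolve commitments into real actions, then look up the payoffs."""
    num_committed = [a0, a1].count(M)
    r0 = mediator_action(num_committed) if a0 == M else a0
    r1 = mediator_action(num_committed) if a1 == M else a1
    return PAYOFF[(r0, r1)]


def is_nash(profile, actions):
    """True if no agent gains by unilaterally switching to another action."""
    for i in range(2):
        base = mediated_payoff(*profile)[i]
        for deviation in actions:
            alt = list(profile)
            alt[i] = deviation
            if mediated_payoff(*alt)[i] > base:
                return False
    return True


def pure_nash_equilibria(actions):
    """Enumerate all pure-strategy equilibria over the given action set."""
    return [p for p in product(actions, repeat=2) if is_nash(p, actions)]


if __name__ == "__main__":
    print("without mediator:", pure_nash_equilibria([C, D]))
    print("with mediator:   ", pure_nash_equilibria([C, D, M]))
```

Under these assumed payoffs, mutual defection is the only pure equilibrium of the original game, while adding the commit action makes mutual commitment an equilibrium as well, with both agents receiving the cooperative payoff. Mutual defection remains an equilibrium too, so the sketch says nothing about which one learning agents would reach.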

Authors (3)
  1. Dmitry Ivanov (19 papers)
  2. Ilya Zisman (12 papers)
  3. Kirill Chernyshev (2 papers)