Steering Language Models with Game-Theoretic Solvers (2402.01704v3)

Published 24 Jan 2024 in cs.CL, cs.AI, and cs.GT

Abstract: Mathematical models of interactions among rational agents have long been studied in game theory. However, these interactions are often over a small set of discrete game actions which is very different from how humans communicate in natural language. To bridge this gap, we introduce a framework that allows equilibrium solvers to work over the space of natural language dialogue generated by LLMs. Specifically, by modelling the players, strategies and payoffs in a "game" of dialogue, we create a binding from natural language interactions to the conventional symbolic logic of game theory. Given this binding, we can ask existing game-theoretic algorithms to provide us with strategic solutions (e.g., what string an LLM should generate to maximize payoff in the face of strategic partners or opponents), giving us predictors of stable, rational conversational strategies. We focus on three domains that require different negotiation strategies: scheduling meetings, trading fruit and debate, and evaluate an LLM's generated language when guided by solvers. We see that LLMs that follow game-theory solvers result in dialogue generations that are less exploitable than the control (no guidance from solvers), and the language generated results in higher rewards, in all negotiation domains. We discuss future implications of this work, and how game-theoretic solvers that can leverage the expressivity of natural language can open up a new avenue of guiding language research.

Background and Objectives

The intricate relationship between strategic human interaction and language has long been acknowledged, yet formalizing that relationship within a game-theoretic framework remains a distinctive challenge. This paper introduces an approach that integrates the generative capabilities of LLMs with the strategic analysis of game theory, aiming to compute stable, rational strategies in conversational dialogue. By treating language as a strategic tool, the framework casts LLMs not only as agents capable of realistic dialogue simulation but also as generators of fresh dialogue scenarios grounded in real-world applications.

Game-Theoretic Integration with LLMs

The paper's central contribution is the establishment of a "binding" from conversational dialogue to the language of game theory, thereby reframing dialogue as a formal game. This opens the door to leveraging existing game-theoretic algorithms to solve strategic interactions represented in the space of language. Furthermore, drawing on the generative capabilities of LLMs, the authors propose and implement generalizations of equilibrium-finding algorithms for the dialogue setting. This intersection also points toward new algorithms designed specifically for language and dialogue spaces.
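
To make the binding concrete, the following minimal sketch (an illustration, not the authors' code) treats candidate LLM-generated messages as discrete actions and scores each joint outcome into the payoff tensors a conventional equilibrium solver expects. The candidate utterances and the toy payoff rule are invented for this example.

```python
# Minimal sketch: binding one dialogue exchange to a normal-form game.
# Candidate messages play the role of discrete actions; a payoff function
# scores each joint outcome.
import itertools
import numpy as np

# Hypothetical candidate replies each side's LLM might generate.
actions = {
    "buyer":  ["I can offer $40.", "I'll meet you at $50."],
    "seller": ["The price is $60, firm.", "I could come down to $50."],
}

def payoff(buyer_msg: str, seller_msg: str) -> tuple[float, float]:
    """Stand-in scorer; in the paper's setting this role could be filled by
    an LLM or a rule mapping a joint dialogue outcome to utilities."""
    deal = "$50" in buyer_msg and "$50" in seller_msg  # toy agreement rule
    return (1.0, 1.0) if deal else (0.0, 0.0)

# Build the payoff tensors a standard equilibrium solver can consume.
n_b, n_s = len(actions["buyer"]), len(actions["seller"])
U_buyer = np.zeros((n_b, n_s))
U_seller = np.zeros((n_b, n_s))
for (i, b), (j, s) in itertools.product(
        enumerate(actions["buyer"]), enumerate(actions["seller"])):
    U_buyer[i, j], U_seller[i, j] = payoff(b, s)

print(U_buyer)  # any normal-form solver can now operate on these tensors
```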

Another significant aspect of the paper is its method for using LLMs to rapidly synthesize formal games. The resulting large repository of games enables rigorous study and testing of game-theoretic solution concepts. Crucially, the combination of LLM-driven game generation, game-theoretic solvers, and imitation learning yields a process for enhancing LLMs' strategic capabilities in multi-agent environments.
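
As intuition for the game-synthesis step, the sketch below prompts a language model for a machine-readable game description and parses it into solver-ready structures. The prompt, the JSON schema, and the stubbed model are all assumptions made for this example, not the paper's actual pipeline.

```python
# Hedged sketch of LLM-assisted game synthesis (illustrative only).
import json

GAME_PROMPT = """Propose a two-player negotiation game as JSON with keys
"players", "actions" (candidate utterances per player), and "payoffs"
(a matrix of [p1, p2] utilities, one row per player-1 action)."""

def synthesize_game(llm_call) -> dict:
    """llm_call: any callable str -> str; parsed output feeds a solver."""
    game = json.loads(llm_call(GAME_PROMPT))
    assert {"players", "actions", "payoffs"} <= game.keys()
    return game

def fake_llm(prompt: str) -> str:
    """Stubbed model so the sketch runs end to end."""
    return json.dumps({
        "players": ["buyer", "seller"],
        "actions": [["offer $40", "offer $55"],
                    ["hold at $60", "drop to $50"]],
        "payoffs": [[[0, 0], [1, 1]], [[0, 0], [2, 1]]],
    })

print(synthesize_game(fake_llm)["players"])
```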

From Theoretical Framing to Practical Application

Practical considerations are addressed by implementing the theoretical framework in an open-source codebase, chat_games, enabling researchers and practitioners to model their own dialogue games and solve them using established game-theoretic solvers. Casting dialogue as a formal game involves defining actions as strings that influence LLM output and modeling payoffs, which in some cases map directly onto real-world outcomes, such as monetary value in a business negotiation. This translation from theory to application underscores the operational versatility of the approach.
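
A rough Python analogue of such a dialogue-game definition might look like the following. This is a hypothetical illustration of the concepts, not the actual chat_games API: string-valued actions steer generation, and a payoff function maps finished transcripts to real-valued outcomes.

```python
# Hypothetical dialogue-game spec (NOT the real chat_games interface).
from dataclasses import dataclass
from typing import Callable

@dataclass
class DialogueGame:
    players: list[str]
    # Each "action" is a string (e.g. a tone or an offer) spliced into the
    # prompt to influence what the LLM generates next.
    actions: dict[str, list[str]]
    # Payoff maps a terminal transcript to one utility per player, e.g.
    # dollars gained in a business negotiation.
    payoff: Callable[[str], dict[str, float]]

def money_payoff(transcript: str) -> dict[str, float]:
    agreed = "deal" in transcript.lower()
    return {"buyer": 10.0 if agreed else 0.0,
            "seller": 10.0 if agreed else 0.0}

game = DialogueGame(
    players=["buyer", "seller"],
    actions={"buyer": ["be conciliatory", "be aggressive"],
             "seller": ["anchor high", "meet in the middle"]},
    payoff=money_payoff,
)
print(game.payoff("Great, it's a deal."))
```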

Empirical Strength and Implications

Empirical validation of the proposed methodology demonstrates the potential for strategic improvement of LLMs. The experiments use game-theoretic solvers as improvement operators, showing that algorithms such as counterfactual regret minimization (CFR) and policy-space response oracles (PSRO) yield policies that improve on baseline LLM strategies. Extensive testing within the paper's defined domains, such as scheduling meetings and trading fruit, provides robust support for the authors' claims.
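
For intuition about the solver side, the snippet below implements regret matching, the core update inside CFR, in self-play on a toy two-action matrix game standing in for a dialogue game. The payoff numbers are arbitrary; the paper itself operates over far richer language-defined games.

```python
# Regret matching in self-play on a two-player normal-form game (a sketch).
import numpy as np

def _strategy(regrets: np.ndarray) -> np.ndarray:
    """Regret matching: play actions in proportion to positive regret."""
    pos = np.maximum(regrets, 0.0)
    total = pos.sum()
    return pos / total if total > 0 else np.full(len(regrets), 1.0 / len(regrets))

def regret_matching(U1: np.ndarray, U2: np.ndarray, iters: int = 10_000):
    """U1[i, j], U2[i, j]: payoffs to players 1 and 2 under actions (i, j).
    Returns time-averaged strategies; these converge to a coarse correlated
    equilibrium in general-sum games (a Nash equilibrium when zero-sum)."""
    n1, n2 = U1.shape
    r1, r2 = np.zeros(n1), np.zeros(n2)      # cumulative regrets
    avg1, avg2 = np.zeros(n1), np.zeros(n2)  # strategy accumulators
    for _ in range(iters):
        s1, s2 = _strategy(r1), _strategy(r2)
        u1 = U1 @ s2       # player 1's expected payoff per pure action
        u2 = U2.T @ s1     # player 2's expected payoff per pure action
        r1 += u1 - s1 @ u1  # instantaneous regret vs. realized value
        r2 += u2 - s2 @ u2
        avg1 += s1
        avg2 += s2
    return avg1 / iters, avg2 / iters

# Toy payoffs for two candidate negotiation messages per player.
U1 = np.array([[2.0, 0.0], [0.0, 1.0]])
U2 = np.array([[1.0, 0.0], [0.0, 2.0]])
print(regret_matching(U1, U2))
```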

In conclusion, this research delineates a valuable intersection between game theory and LLMs, furnishing both a conceptual and a practical framework that could reshape how we anticipate and construct strategic behavior in conversational AI. With numerical results validating solver-based improvement operators for LLMs, the paper offers a strong starting point for future research on optimizing strategic discourse across many domains.

Authors (9)
  1. Ian Gemp
  2. Yoram Bachrach
  3. Marc Lanctot
  4. Roma Patel
  5. Vibhavari Dasagi
  6. Luke Marris
  7. Georgios Piliouras
  8. Karl Tuyls
  9. Siqi Liu