History Filtering in Imperfect Information Games: Algorithms and Complexity (2311.14651v1)

Published 24 Nov 2023 in cs.GT and cs.AI

Abstract: Historically applied exclusively to perfect information games, depth-limited search with value functions has been key to recent advances in AI for imperfect information games. Most prominent approaches with strong theoretical guarantees require subgame decomposition - a process in which a subgame is computed from public information and player beliefs. However, subgame decomposition can itself require non-trivial computations, and its tractability depends on the existence of efficient algorithms for either full enumeration or generation of the histories that form the root of the subgame. Despite this, no formal analysis of the tractability of such computations has been established in prior work, and application domains have often consisted of games, such as poker, for which enumeration is trivial on modern hardware. Applying these ideas to more complex domains requires understanding their cost. In this work, we introduce and analyze the computational aspects and tractability of filtering histories for subgame decomposition. We show that constructing a single history from the root of the subgame is generally intractable, and then provide a necessary and sufficient condition for efficient enumeration. We also introduce a novel Markov Chain Monte Carlo-based generation algorithm for trick-taking card games - a domain where enumeration is often prohibitively expensive. Our experiments demonstrate its improved scalability in the trick-taking card game Oh Hell. These contributions clarify when and how depth-limited search via subgame decomposition can be an effective tool for sequential decision-making in imperfect information settings.
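The abstract's MCMC-based generation idea can be illustrated with a generic sketch. The paper's actual algorithm is not reproduced here; the code below is a minimal, hypothetical Metropolis-style sampler over hidden card deals, where the function and parameter names (`mcmc_deal_sampler`, `is_consistent`, etc.) are illustrative inventions. It assumes at least one consistent deal exists, starts from one found by rejection sampling, then proposes symmetric single-card swaps between hidden hands and accepts any proposal that remains consistent with the public play (a uniform target over the consistent support gives acceptance probability 1 there):

```python
import random

def mcmc_deal_sampler(hidden_cards, hand_sizes, is_consistent, steps=1000, seed=None):
    """Sample a deal of hidden cards into the other players' hands via MCMC.

    hidden_cards: cards not visible to the searching player.
    hand_sizes: one size per hidden hand.
    is_consistent: predicate checking a candidate deal (list of hands)
        against the public play so far, e.g. a player who failed to
        follow suit cannot hold any card of that suit.
    Assumes at least one consistent deal exists.
    """
    rng = random.Random(seed)

    # Initialize with rejection sampling: shuffle and split until consistent.
    while True:
        cards = hidden_cards[:]
        rng.shuffle(cards)
        deal, i = [], 0
        for n in hand_sizes:
            deal.append(cards[i:i + n])
            i += n
        if is_consistent(deal):
            break

    # Random-walk Metropolis: swap one card between two hands per step;
    # the proposal is symmetric, so under a uniform target we accept
    # whenever the result is still consistent and revert otherwise.
    for _ in range(steps):
        a, b = rng.sample(range(len(deal)), 2)
        ia, ib = rng.randrange(len(deal[a])), rng.randrange(len(deal[b]))
        deal[a][ia], deal[b][ib] = deal[b][ib], deal[a][ia]
        if not is_consistent(deal):
            deal[a][ia], deal[b][ib] = deal[b][ib], deal[a][ia]  # revert swap
    return deal
```

The appeal over full enumeration is that each step costs only one consistency check, so the sampler's cost is independent of the (possibly enormous) number of consistent histories.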

