Emergent Mind

Grandmaster-Level Chess Without Search

(2402.04494)
Published Feb 7, 2024 in cs.LG, cs.AI, and stat.ML

Abstract

The recent breakthrough successes in machine learning are mainly attributed to scale: namely large-scale attention-based architectures and datasets of unprecedented scale. This paper investigates the impact of training at scale for chess. Unlike traditional chess engines that rely on complex heuristics, explicit search, or a combination of both, we train a 270M parameter transformer model with supervised learning on a dataset of 10 million chess games. We annotate each board in the dataset with action-values provided by the powerful Stockfish 16 engine, leading to roughly 15 billion data points. Our largest model reaches a Lichess blitz Elo of 2895 against humans, and successfully solves a series of challenging chess puzzles, without any domain-specific tweaks or explicit search algorithms. We also show that our model outperforms AlphaZero's policy and value networks (without MCTS) and GPT-3.5-turbo-instruct. A systematic investigation of model and dataset size shows that strong chess performance only arises at sufficient scale. To validate our results, we perform an extensive series of ablations of design choices and hyperparameters.

Overview

  • The paper explores the potential of large-scale models and supervised learning to achieve grandmaster-level chess performance without search algorithms.

  • Using action-values from Stockfish 16, the researchers trained a 270 million parameter transformer model to predict winning probabilities from chess positions.

  • The model achieved a Lichess blitz Elo of 2895, surpassing AlphaZero's policy and value networks (evaluated without MCTS) and GPT-3.5-turbo-instruct on a select set of metrics, without using domain-specific enhancements or search techniques.

  • The study shows the impact of model and dataset scale on chess AI performance, suggesting a future where complex reasoning could be captured by neural predictors.
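For context on the 2895 figure, the standard Elo model predicts a player's expected score from the rating difference. The formula below is the general logistic Elo model, not a calculation from the paper; the 2500 opponent rating is purely illustrative.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

# Example: a 2895-rated player facing a 2500-rated grandmaster
# is expected to score roughly 0.91 points per game.
print(round(elo_expected_score(2895, 2500), 3))
```

Under this model, a ~400-point rating gap already implies near-total dominance, which is why a blitz Elo of 2895 against human opponents places the model firmly at grandmaster level.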

Introduction

The breakthroughs in AI over the past few years have been driven by the application of large-scale models and massive datasets. This paper investigates the application of such paradigms to the game of chess, traditionally dominated by engines using deep search algorithms and large databases of heuristics. Through the lens of supervised learning, the authors of the paper hypothesize that it is possible to achieve strong chess performance purely from learned action-values, without the traditional explicit search algorithms.

Methodology

The team constructed a dataset by extracting and annotating 10 million games from Lichess with action-values sourced from the elite chess engine Stockfish 16. A transformer model with 270 million parameters was then trained to predict win probabilities for given chess board positions. The researchers systematically studied the effects of varying model sizes and dataset scales to understand their impact on generalization and chess performance.
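The annotation-and-training setup can be sketched in miniature: engine scores are mapped to win probabilities, then discretized into value bins so that value prediction becomes a classification problem. The logistic constant below is the one Lichess uses for its win-percentage metric, and the bin count of 128 is an illustrative choice; treat both as assumptions rather than the paper's exact configuration.

```python
import math

def centipawns_to_win_prob(cp: float, k: float = 0.00368208) -> float:
    """Map a Stockfish centipawn score to a win probability via a
    logistic curve. The constant k follows Lichess's win-percentage
    metric and is illustrative here."""
    return 1.0 / (1.0 + math.exp(-k * cp))

def discretize(p: float, num_bins: int = 128) -> int:
    """Bucket a win probability into one of num_bins uniform value bins,
    turning value regression into a classification target."""
    return min(int(p * num_bins), num_bins - 1)

# A +100 cp (one pawn) advantage maps to a modest but clear edge.
p = centipawns_to_win_prob(100)   # ~0.59
bin_index = discretize(p)
```

In practice each of the ~15 billion annotated data points pairs a board state (and candidate move) with such a discretized value target for supervised training.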

Results and Contributions

The model achieves a Lichess blitz Elo of 2895, placing it at grandmaster level. It also outperforms AlphaZero's policy and value networks (evaluated without Monte Carlo Tree Search) and GPT-3.5-turbo-instruct on a select set of metrics, reinforcing the view that neural predictors can generalize strongly. Key contributions include:

  • A neural predictor trained on millions of annotated board states capturing Stockfish 16's expertise.
  • A grandmaster-level policy derived from this predictor, capable of tackling challenging chess puzzles, without any domain-specific enhancements or search algorithms.
  • Demonstrated significance of scale for achieving robust generalization and performance in chess, accompanied by a detailed analysis through rigorous ablations and scaling studies.
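The search-free policy amounts to a single greedy step: evaluate the predicted action-value of every legal move and play the argmax, with no lookahead. A minimal sketch follows; the stub predictor and move strings are hypothetical stand-ins for the trained 270M-parameter transformer and a real move generator (e.g., a chess library).

```python
from typing import Callable, Dict, List

def greedy_policy(legal_moves: List[str],
                  action_value: Callable[[str], float]) -> str:
    """Play the legal move with the highest predicted win probability.
    No search tree: one network evaluation per candidate move."""
    return max(legal_moves, key=action_value)

# Stub predictor with made-up values, standing in for the trained model.
fake_values: Dict[str, float] = {"e2e4": 0.54, "d2d4": 0.53, "g1f3": 0.52}
best = greedy_policy(list(fake_values), fake_values.__getitem__)
print(best)  # → e2e4
```

The contrast with traditional engines is that all "reasoning" is amortized into the predictor at training time, so inference cost scales with the number of legal moves rather than with search depth.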

Discussion and Implications

Despite its strong results, the study acknowledges a limitation: the model evaluates each position without access to game history, which traditional engines use for strategic planning (for instance, detecting threefold repetition), and this exposes minor weaknesses. These were pragmatically mitigated through workarounds that do not amount to a search algorithm. The study also characterizes the scale that similar neural predictors would need to close the gap between current model performance and oracle-powered engines like Stockfish 16.

The work underlines a shift in AI research, suggesting that highly complex algorithmic reasoning—once deemed exclusive to sophisticated search-based systems—can be distilled into transformer models using supervised learning. As such, transformers are not simply pattern recognition systems, but powerful approximators capable of substituting intricate algorithmic processes. The implications for future research are profound, promising further shifts in how AI is applied across varied cognitive domains.

