A State-Space Perspective on Modelling and Inference for Online Skill Rating (2308.02414v3)
Abstract: We summarise popular methods used for skill rating in competitive sports, along with their inferential paradigms, and introduce new approaches based on sequential Monte Carlo and discrete hidden Markov models. We advocate for a state-space model perspective, wherein players' skills are represented as time-varying and match results serve as observed quantities. We explore the steps to construct the model and the three stages of inference: filtering, smoothing and parameter estimation. We examine the challenges of scaling up to numerous players and matches, highlighting the main approximations and reductions that facilitate statistical and computational efficiency. We additionally compare approaches in a realistic experimental pipeline that can be easily reproduced and extended with our open-source Python package, https://github.com/SamDuffield/abile.