
Glicko Rating System

Updated 8 December 2025
  • Glicko Rating System is a probabilistic framework that extends Elo by modeling both mean skill and uncertainty via Bayesian updates.
  • It dynamically updates ratings using logistic link functions and Kalman filter-inspired steps to robustly predict head-to-head contest outcomes.
  • Extensions include explicit draw modeling, home-field advantage adjustments, and league transition factors, broadening its applicability in analytics.

The Glicko rating system is a probabilistic framework for estimating and tracking the playing strength of competitors from observed outcomes in head-to-head contests. Originally devised as a principled extension of the Elo rating system, Glicko and its successor, Glicko-2, incorporate explicit modeling of both mean skill and the uncertainty (variance) in that estimate, with dynamic adjustment of rating confidence over time. The Glicko methodology underpins a broad class of Bayesian and Kalman-filter-based rating algorithms used in sports analytics, competitive gaming, and algorithm benchmarking, and has motivated further generalizations that adapt to draws, promotions, and stochasticity in outcome generation.

1. Statistical Foundations and Model Parameters

The Glicko-2 system encodes each competitor by three parameters: a location parameter (mean skill) $\mu$, a rating deviation (uncertainty) $\phi$, and a volatility parameter $\sigma$ controlling the allowed drift in skill between rating periods; the original Glicko system tracks only the first two. The latent skill $\theta$ of each player or team is modeled as a Gaussian random variable, $\theta \sim \mathrm{Normal}(\mu, \phi^2)$. Within any match, outcomes against an opponent $j$ are modeled via a logistic link as in the Bradley–Terry framework:

$$E(\mu_i, \mu_j, \phi_j) = \frac{1}{1 + \exp(-g(\phi_j)(\mu_i - \mu_j))}, \quad g(\phi_j) = \frac{1}{\sqrt{1 + 3\phi_j^2 / \pi^2}}.$$

The rating deviation $\phi$ reflects epistemic uncertainty: it is large for infrequent or new competitors and shrinks as more contests are observed. The volatility parameter $\sigma$ controls the system's dynamic response to non-stationarity in skill, allowing for increases in $\phi$ in the absence of matches or after structural changes (e.g., team composition) (Shelopugin et al., 2023, Bober-Irizar et al., 1 Oct 2024).
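As a minimal sketch of the expected-score formula above, the two functions below evaluate $g(\phi_j)$ and $E(\mu_i, \mu_j, \phi_j)$ on the Glicko-2 scale. The function names `g` and `expected_score` are illustrative choices, not taken from any particular library.

```python
import math


def g(phi: float) -> float:
    """Attenuation factor: the more uncertain the opponent, the closer to 0."""
    return 1.0 / math.sqrt(1.0 + 3.0 * phi**2 / math.pi**2)


def expected_score(mu_i: float, mu_j: float, phi_j: float) -> float:
    """Logistic expected score of competitor i against opponent j."""
    return 1.0 / (1.0 + math.exp(-g(phi_j) * (mu_i - mu_j)))


if __name__ == "__main__":
    # Same rating gap, increasingly uncertain opponent: prediction moves toward 0.5.
    for phi_j in (0.0, 0.5, 2.0):
        print(phi_j, round(expected_score(1.0, 0.0, phi_j), 4))
```

The design point worth noticing is that only the opponent's deviation enters $g$: a large $\phi_j$ flattens the logistic curve, so a result against a poorly known opponent carries less information.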

2. Core Glicko-2 Updating Equations

Glicko-2 implements Bayesian-style mean and variance updates for each rating period, aggregating observed matches as sufficient statistics. For $n$ matches indexed by $j$, with outcomes $s_j \in \{0,1\}$ and expected scores $E_j = E(\mu, \mu_j, \phi_j)$, the following intermediates are computed:

$$v = \left[\sum_{j=1}^n g(\phi_j)^2 E_j (1 - E_j)\right]^{-1}, \quad \Delta = v \sum_{j=1}^n g(\phi_j)(s_j - E_j).$$

The updated volatility $\sigma'$ is obtained by numerically solving

$$f(x) = \frac{e^x(\Delta^2 - \phi^2 - v - e^x)}{2(\phi^2 + v + e^x)^2} - \frac{x - a}{\tau^2} = 0, \quad a = \ln \sigma^2,$$

for $x$, typically via the iterative algorithm of Glickman (2012), and setting $\sigma' = e^{x/2}$. Subsequently,

$$\phi^* = \sqrt{\phi^2 + \sigma'^2}, \quad \phi' = \left[\frac{1}{(\phi^*)^2} + \frac{1}{v}\right]^{-1/2}, \quad \mu' = \mu + (\phi')^2 \sum_{j=1}^n g(\phi_j)(s_j - E_j).$$
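The full rating-period update can be sketched in a few dozen lines of Python. The sketch below assumes all quantities are already on the Glicko-2 scale, uses a regula-falsi (Illinois-style) iteration for the volatility root as in Glickman's published procedure, and treats a period without games by inflating the deviation only. The function names `solve_volatility` and `glicko2_update` are hypothetical.

```python
import math


def g(phi):
    return 1.0 / math.sqrt(1.0 + 3.0 * phi**2 / math.pi**2)


def expected(mu, mu_j, phi_j):
    return 1.0 / (1.0 + math.exp(-g(phi_j) * (mu - mu_j)))


def solve_volatility(phi, sigma, v, delta, tau, eps=1e-6):
    """Find the root x of f(x) and return sigma' = exp(x / 2)."""
    a = math.log(sigma**2)

    def f(x):
        ex = math.exp(x)
        num = ex * (delta**2 - phi**2 - v - ex)
        return num / (2.0 * (phi**2 + v + ex) ** 2) - (x - a) / tau**2

    # Bracket the root between A and B.
    A = a
    if delta**2 > phi**2 + v:
        B = math.log(delta**2 - phi**2 - v)
    else:
        k = 1
        while f(a - k * tau) < 0:
            k += 1
        B = a - k * tau
    fA, fB = f(A), f(B)
    # Regula falsi with the Illinois modification (halving a stale endpoint).
    while abs(B - A) > eps:
        C = A + (A - B) * fA / (fB - fA)
        fC = f(C)
        if fC * fB <= 0:
            A, fA = B, fB
        else:
            fA /= 2.0
        B, fB = C, fC
    return math.exp(A / 2.0)


def glicko2_update(mu, phi, sigma, opponents, scores, tau=0.5):
    """opponents: list of (mu_j, phi_j); scores: list of s_j in {0, 1}."""
    if not opponents:
        # No games this period: the mean is kept, the deviation grows.
        return mu, math.sqrt(phi**2 + sigma**2), sigma
    info = 0.0      # sum of g^2 * E * (1 - E), i.e. 1 / v
    surprise = 0.0  # sum of g * (s - E), i.e. delta / v
    for (mu_j, phi_j), s in zip(opponents, scores):
        e = expected(mu, mu_j, phi_j)
        info += g(phi_j) ** 2 * e * (1.0 - e)
        surprise += g(phi_j) * (s - e)
    v = 1.0 / info
    delta = v * surprise
    sigma_new = solve_volatility(phi, sigma, v, delta, tau)
    phi_star = math.sqrt(phi**2 + sigma_new**2)
    phi_new = 1.0 / math.sqrt(1.0 / phi_star**2 + 1.0 / v)
    mu_new = mu + phi_new**2 * surprise
    return mu_new, phi_new, sigma_new
```

For a player at $\mu = 0$, $\phi \approx 1.1513$, $\sigma = 0.06$ with $\tau = 0.5$ who beats an opponent at $(-0.5756, 0.1727)$ and loses to opponents at $(0.2878, 0.5756)$ and $(1.1513, 1.7269)$, this sketch gives approximately $\mu' \approx -0.207$ and $\phi' \approx 0.872$, with the volatility almost unchanged.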

Conversion between the Glicko and Glicko-2 scales is affine, e.g., $\mu = (r-1500)/173.7178$ for the location parameter $\mu$, where $r$ is the classical Elo-scale rating. These updates yield both the new rating and an updated quantification of confidence (Shelopugin et al., 2023, Bober-Irizar et al., 1 Oct 2024, Cardoso et al., 2021).
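The scale conversion is simple enough to state as a pair of helper functions. This is a minimal sketch; the names `to_glicko2` and `from_glicko2` are illustrative, and the rating deviation is assumed to be rescaled by the same constant as the rating, as in Glickman's description of Glicko-2.

```python
SCALE = 173.7178   # approximately 400 / ln(10)
CENTER = 1500.0    # rating assigned to an average or unrated competitor


def to_glicko2(r: float, rd: float) -> tuple[float, float]:
    """Map an Elo-scale rating and deviation to (mu, phi)."""
    return (r - CENTER) / SCALE, rd / SCALE


def from_glicko2(mu: float, phi: float) -> tuple[float, float]:
    """Map (mu, phi) back to an Elo-scale rating and deviation."""
    return SCALE * mu + CENTER, SCALE * phi


if __name__ == "__main__":
    print(to_glicko2(1500.0, 200.0))
    print(from_glicko2(*to_glicko2(1850.0, 75.0)))
```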

3. Extensions: Draws, Promotion, and Domain Adaptations

Glicko-2's original binary win–loss paradigm limits its adequacy in settings where draws or domain transitions are common. Recent extensions include:

  • Explicit Draw Modeling: In soccer and chess, draws are neither rare nor uniformly distributed. The Glicko-2 framework can be extended to a three-state softmax:

$$\begin{aligned} P_{\text{win}} &= \frac{\exp(g(\phi_j)(\mu_i - \mu_j))}{1 + \exp(g(\phi_j)(\mu_i - \mu_j)) + \exp(d + \hat{s})}, \\ P_{\text{draw}} &= \frac{\exp(d + \hat{s})}{1 + \exp(g(\phi_j)(\mu_i - \mu_j)) + \exp(d + \hat{s})}, \\ P_{\text{loss}} &= \frac{1}{1 + \exp(g(\phi_j)(\mu_i - \mu_j)) + \exp(d + \hat{s})}, \end{aligned}$$

where $\hat{s}$ is an exogenous draw-probability and $d$ a learnable bias. This probabilistic output directly replaces $E(\cdot)$ in the update formulae (Shelopugin et al., 2023).

  • Home-field Advantage: A constant $h$ is added to $\mu$ when the competitor is home, with context-sensitive reductions (e.g., pandemic periods).
  • League-transition Factors: For promotions/relegations, teams entering a new competition receive an additive rating shift $\mu_l$ to reflect expected adaptation cost or benefit, with overall means re-centered after these transitions.
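The three-outcome model and the home-field term can be sketched together. The function below evaluates the softmax over win, draw, and loss, adding $h$ to the home competitor's mean before taking the rating difference. The name `outcome_probs` and the keyword `home_adv` are hypothetical, and applying $h$ inside the win logit is an assumption about how the additive adjustment enters; a league-transition shift $\mu_l$ would be added to $\mu_i$ in the same way.

```python
import math


def g(phi):
    return 1.0 / math.sqrt(1.0 + 3.0 * phi**2 / math.pi**2)


def outcome_probs(mu_i, mu_j, phi_j, d, s_hat, home_adv=0.0):
    """Return (P_win, P_draw, P_loss) for competitor i against opponent j."""
    win_logit = g(phi_j) * (mu_i + home_adv - mu_j)
    draw_logit = d + s_hat
    loss_logit = 0.0  # the loss term is the constant 1 = exp(0)
    # Subtract the largest logit before exponentiating for numerical stability.
    top = max(win_logit, draw_logit, loss_logit)
    w = math.exp(win_logit - top)
    dr = math.exp(draw_logit - top)
    l = math.exp(loss_logit - top)
    total = w + dr + l
    return w / total, dr / total, l / total


if __name__ == "__main__":
    # Evenly matched sides, with and without a home advantage of 0.3.
    print(outcome_probs(0.0, 0.0, 0.5, d=-0.5, s_hat=0.0))
    print(outcome_probs(0.0, 0.0, 0.5, d=-0.5, s_hat=0.0, home_adv=0.3))
```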

These modifications are reported to yield modest but meaningful gains in predictive accuracy and interpretability within sports analytics, especially for soccer league modeling (Shelopugin et al., 2023).

4. Theoretical Connections: Bayesian Filtering and Kalman Analogy

Glicko's mathematical structure generalizes to a Bayesian state–space process for skills $\boldsymbol{\theta}_t$:

$$\theta_t \mid \theta_{t-1} \sim \mathcal{N}(\theta_{t-1}, \tau^2), \quad p(y_t \mid \theta_t) \propto \exp(\ell(z_t; y_t)),$$

with logistic or Thurstonian links for $p(y_t \mid \theta_t)$. The Glicko update equations are mathematically equivalent to scalar-coefficient extended Kalman filter steps:

  • Posterior mean updates correspond to Newton steps in the penalized likelihood.
  • Posterior variance (the squared RD) is the scalar inverse Hessian of the penalized log-likelihood at the mode.

Elo emerges by fixing the variance, Glicko employs scalar variance, and TrueSkill utilizes a vector or matrix covariance with full Bayesian message passing. This unification positions Glicko as the 1D Kalman filter for player skill estimation (Szczecinski et al., 2021).
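The filtering analogy can be made concrete with a scalar sketch. Below, `scalar_filter_step` performs a predict step (adding drift variance) followed by a single Newton-style correction under a logistic likelihood, while `elo_step` uses a fixed step size. Both names, the `gain` argument (standing in for $g(\phi_j)$), and the treatment of the opponent's skill as a known point value are assumptions made for illustration.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def elo_step(mu, mu_opp, s, k=0.2):
    """Fixed-variance update: the step size k never changes."""
    return mu + k * (s - sigmoid(mu - mu_opp))


def scalar_filter_step(mu, var, mu_opp, s, drift_var=0.0, gain=1.0):
    """One predict/correct cycle of a scalar extended-Kalman-style filter."""
    var_pred = var + drift_var                 # predict: skill may have drifted
    e = sigmoid(gain * (mu - mu_opp))          # predicted score
    grad = gain * (s - e)                      # gradient of the log-likelihood
    curvature = gain**2 * e * (1.0 - e)        # negative second derivative
    var_post = 1.0 / (1.0 / var_pred + curvature)
    mu_post = mu + var_post * grad             # Newton-style mean correction
    return mu_post, var_post


if __name__ == "__main__":
    mu, var = 0.0, 1.0
    for result in (1, 1, 0, 1):
        mu, var = scalar_filter_step(mu, var, mu_opp=0.0, s=result, drift_var=0.01)
        print(round(mu, 4), round(var, 4))
```

With a single match and `gain` set to $g(\phi_j)$, the correction has the same form as the Glicko-2 expressions for $\phi'$ and $\mu'$; fixing the step size at a constant recovers the Elo rule.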

5. Comparisons, Domain Generalization, and Practical Performance

Empirical analyses of Glicko-2 demonstrate superior data efficiency over fixed-$K$ Elo and simplicity compared to fully Bayesian vector-covariance frameworks like TrueSkill. Comparative studies in CS:GO show Glicko-2 outperforming basic Elo and 1v1 TrueSkill on small and moderate data, with TrueSkill gaining an edge only at very large scales with individual-level rating granularity (Bober-Irizar et al., 1 Oct 2024). In machine learning benchmarks, Glicko-2 assimilates tournament outcomes among classifiers (using IRT-based win/draw/loss assignments), yielding ratings that quantify both rank and uncertainty (Cardoso et al., 2021).

Glicko-2-based league ratings not only provide accurate team strength metrics but also serve as interpretable and transferable features for broader analytics, such as predicting future performance, player migration effects, or classifier capability diagnostics (Shelopugin et al., 2023, Cardoso et al., 2021).

6. Recent Generalizations and Robust Alternatives

Further generalizations incorporate non-Gaussian uncertainty and stochastic elements via:

  • Explicit Luck Modeling: The “paired-chance” extension hybridizes the logistic link with a uniform noise floor parameter $\beta$, accommodating games with significant random outcome variation. Discrete density-based updates replace Gaussian updating, yielding bounded learning rates and less erratic rating swings (Cowan, 2023).
  • Strength-dependent Tie Probability: For domains (e.g., chess) where tie rates scale with average competitor strength, the modern extension of Glicko introduces explicit parameters $(\beta_0, \beta_1)$ for the log-draw-odds, with closed-form marginal updates via Newton–Raphson on the posterior log-density. This addresses systematic bias in ratings for high draw-probability cohorts and has been successfully deployed in operational chess federations (Glickman, 12 Jun 2025).
  • Laplace-Driven Variance Evolution: Recent models (sometimes termed “vElo”) use Laplace-approximated posterior variances and player-specific variance increments, with per-match updates accommodating player heterogeneity and ensuring robust forecast accuracy, particularly for new or volatile participants (Hua et al., 2023).

These approaches further mitigate the limitations of moment-based Gaussian updates, offering exact Bayesian integration (via grid or quadrature) at modest computational cost and improved generalization to atypical or noisier domains.
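To illustrate what grid-based Bayesian integration looks like in the simplest case, the sketch below discretizes a Gaussian prior over one competitor's skill, multiplies it by a plain logistic likelihood for a single result, and reads off the posterior mean and standard deviation. It is not the paired-chance or vElo model; the function name `grid_posterior`, the grid settings, and the assumption that the opponent's skill is a known point value are all illustrative.

```python
import math


def grid_posterior(mu, phi, mu_opp, s, half_width=6.0, n=2001):
    """Posterior mean and sd of skill after one result (s = 1 win, 0 loss)."""
    lo = mu - half_width * phi
    step = 2.0 * half_width * phi / (n - 1)
    grid = [lo + i * step for i in range(n)]

    weights = []
    for theta in grid:
        prior = math.exp(-0.5 * ((theta - mu) / phi) ** 2)   # unnormalized Gaussian
        p_win = 1.0 / (1.0 + math.exp(-(theta - mu_opp)))    # logistic likelihood
        weights.append(prior * (p_win if s == 1 else 1.0 - p_win))

    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)


if __name__ == "__main__":
    print(grid_posterior(0.0, 1.0, 0.0, s=1))
    print(grid_posterior(0.0, 1.0, 0.0, s=0))
```

Because the posterior is evaluated pointwise, swapping in a different likelihood (for example, one with a noise floor or an explicit draw term) only changes the line that computes the per-point likelihood.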

7. Implementation, Optimization, and Parameter Choices

Glicko-2 deployment involves the following operational features:

  • Parameters per league or domain: $\{\mu_{\text{init}}, \mu_{\text{new}}, \mu_l, \phi_s, h, h_p\}$, with global $\{\phi_0, \sigma_0, d\}$.
  • All free parameters can be estimated via maximum likelihood (minimizing log-loss on historical match outcomes) using gradient-based or derivative-free optimizers.
  • The volatility update typically solves for $\sigma'$ within a tolerance of $10^{-6}$.
  • Adaptation to sports without draws or with team-based outcomes requires only straightforward modifications.
  • In non-human competitions, such as classifier tournaments, period grouping permits Glicko-2 to robustly summarize per-algorithm performance and uncertainty (Cardoso et al., 2021, Shelopugin et al., 2023).
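Parameter estimation by log-loss minimization can be sketched for a single free parameter. The example below fits only the home advantage $h$ with a derivative-free golden-section search, holding ratings fixed and setting $g(\phi_j) = 1$ for simplicity; a real fit would re-run the rating updates for each candidate parameter vector. The match data and the names `log_loss` and `golden_section_min` are hypothetical.

```python
import math


def log_loss(h, matches):
    """Mean negative log-likelihood of home results under home advantage h."""
    total = 0.0
    for mu_home, mu_away, s in matches:
        p = 1.0 / (1.0 + math.exp(-(mu_home + h - mu_away)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= s * math.log(p) + (1 - s) * math.log(1.0 - p)
    return total / len(matches)


def golden_section_min(f, lo, hi, tol=1e-6):
    """Derivative-free minimizer for a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if f(c) < f(d):
            hi = d
        else:
            lo = c
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Hypothetical data: (home mu, away mu, home result); home side wins 3 of 4.
    matches = [(0.0, 0.0, 1), (0.0, 0.0, 1), (0.0, 0.0, 1), (0.0, 0.0, 0)]
    h_hat = golden_section_min(lambda h: log_loss(h, matches), -3.0, 3.0)
    print(round(h_hat, 4))
```

With equal ratings and a 75% home win rate, the minimizer sits where the logistic of $h$ equals 0.75, i.e. at $h = \ln 3$.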

In practical terms, typical defaults, such as $\tau \in [0.3, 1.2]$ for volatility and $\sigma_0 \approx 0.06$ on the Glicko-2 scale, work well, with little sensitivity shown unless extremely rapid skill evolution is anticipated (Bober-Irizar et al., 1 Oct 2024).


By leveraging Bayesian updating and domain-specific extensions, the Glicko family of rating systems has become foundational in quantifying and forecasting competitive skill. Rigorous probabilistic structure, empirical robustness, and interpretability account for its enduring relevance in both sports analytics and algorithmic performance assessment (Shelopugin et al., 2023, Bober-Irizar et al., 1 Oct 2024, Glickman, 12 Jun 2025).
