Kolmogorov–Smirnov Statistic

Updated 22 February 2026

The Kolmogorov–Smirnov statistic is a nonparametric measure that quantifies the maximum distance between empirical and theoretical cumulative distribution functions, forming the basis for one-sample and two-sample tests.
It has exact finite-sample laws and can be computed in linear time on sorted samples, which matters for model validation and goodness-of-fit testing at scale.
Modern extensions, including variance weighting and Bayesian adaptations, enhance its sensitivity to tail differences and parameter uncertainties in practical statistical applications.

The Kolmogorov–Smirnov (KS) statistic is a canonical nonparametric measure for quantifying the distance between empirical and theoretical cumulative distribution functions (CDFs), or between two empirical CDFs. It underpins a family of hypothesis tests, in both one-sample and two-sample variants, that are pivotal in model validation, goodness-of-fit testing, and distributional comparison. The KS statistic is distinguished by its distribution-free properties under the null, exact finite-sample laws, tractable computation, and its role as the central object of the Dvoretzky–Kiefer–Wolfowitz–Massart inequalities and of classical empirical process theory. The following sections present a technical exposition of the KS statistic, its mathematical structure, computation, asymptotic and nonasymptotic properties, and specialized results.

1. Mathematical Definition and Variational Structure

Let $\{X_1, \dots, X_n\}$ be i.i.d. real random variables with unknown CDF $F$, and let $F_n(x) = n^{-1}\sum_{i=1}^n 1\{X_i \leq x\}$ denote the empirical CDF. For the one-sample test, given a hypothesized CDF $F_0$, the KS statistic is

$D_n = \sup_{x \in \mathbb{R}} |F_n(x) - F_0(x)|,$

with one-sided versions

$D_n^+ = \sup_x (F_n(x) - F_0(x)), \quad D_n^- = \sup_x (F_0(x) - F_n(x)).$
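The one-sample statistics above can be evaluated exactly from the order statistics, because the suprema are attained at the jump points of $F_n$. The following is a minimal sketch, assuming a continuous hypothesized CDF passed as a Python callable; the function names `ks_one_sample` and `uniform_cdf` are illustrative, not taken from any library.

```python
def ks_one_sample(sample, cdf):
    """Return (D_n, D_n_plus, D_n_minus) for a sample against a hypothesized CDF."""
    xs = sorted(sample)
    n = len(xs)
    # F_n jumps to (i + 1) / n at the i-th order statistic (0-based index).
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))
    # Just to the left of the i-th order statistic, F_n equals i / n.
    d_minus = max(cdf(x) - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus), d_plus, d_minus


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1] (illustrative null hypothesis)."""
    return min(1.0, max(0.0, x))


d_n, d_n_plus, d_n_minus = ks_one_sample([0.1, 0.4, 0.7], uniform_cdf)
```

Sorting dominates the cost; the scan over the sorted values is a single pass.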

For the two-sample problem, with empirical CDFs $F_m(x), G_n(x)$ from samples $\{X_1,\dots,X_m\}\sim P$, $\{Y_1,\dots,Y_n\}\sim Q$,

$T_{m,n} = \sup_{x \in \mathbb{R}} |F_m(x) - G_n(x)|.$

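The KS statistic admits a variational interpretation as an integral probability metric (IPM): for $\mathcal{F} = \{f : \mathrm{TV}(f) \leq 1\}$ (the unit total variation ball),

$T_{m,n} = \sup_{f \in \mathcal{F}} \left| \frac{1}{m}\sum_{i=1}^m f(X_i) - \frac{1}{n}\sum_{j=1}^n f(Y_j) \right|,$

and the corresponding representer theorem shows this is maximized by the indicator family $\{1\{x \leq t\} : t \in \mathbb{R}\}$, reducing the problem to the CDFs (Sadhanala et al., 2019).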

2. Exact Distributional Theory and Finite-Sample Formulae

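For moderate $n$, the finite-sample law of $D_n^+$ can be written explicitly. The Smirnov–Birnbaum–Tingey formula provides the exact distribution: for $0 < x < 1$, the survival function is

$P(D_n^+ \geq x) = x \sum_{j=0}^{\lfloor n(1-x) \rfloor} \binom{n}{j} \left(x + \frac{j}{n}\right)^{j-1} \left(1 - x - \frac{j}{n}\right)^{n-j},$

with analogous expressions for $D_n^-$ (Mulbregt, 2018).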
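As an illustration, the Birnbaum–Tingey sum $P(D_n^+ \geq x) = x \sum_{j=0}^{\lfloor n(1-x) \rfloor} \binom{n}{j} (x + j/n)^{j-1} (1 - x - j/n)^{n-j}$ can be evaluated by direct summation. This is a sketch for small $n$ only; it does not include the switching between regimes needed for accuracy at large $n$, and the function name `ks_one_sided_sf` is hypothetical.

```python
import math


def ks_one_sided_sf(n, x):
    """P(D_n^+ >= x) by direct summation of the Birnbaum-Tingey formula (small n)."""
    if x <= 0.0:
        return 1.0
    if x >= 1.0:
        return 0.0
    total = 0.0
    for j in range(math.floor(n * (1.0 - x)) + 1):
        a = x + j / n
        # Clamp guards against a tiny negative base caused by rounding.
        b = max(0.0, 1.0 - a)
        total += math.comb(n, j) * a ** (j - 1) * b ** (n - j)
    return x * total
```

For $n = 1$ the statistic is $1 - F_0(X_1)$, so the survival function should equal $1 - x$, which gives a quick sanity check on the sum.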

For the two-sample statistic, under the null $P = Q$, and for $0 < d \leq 1$,

$P(T_{m,n} \geq d) = 1 - \frac{A_{m,n}(d)}{\binom{m+n}{m}},$

where $A_{m,n}(d)$ is the number of lattice paths from $(0,0)$ to $(m,n)$ that do not exit the corridor $\{(i,j) : |i/m - j/n| < d\}$, and admits the Hodges recursion (Viehmann, 2021). Pointwise combinatorial expressions for the discrete hitting times enable evaluation of $p$-values and explicit analysis of the supremum event's realization (Cui et al., 27 Feb 2025).
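The lattice-path count can be sketched with a simple dynamic program over the grid from $(0,0)$ to $(m,n)$, keeping only paths that stay where $|i/m - j/n| < d$. The version below uses exact Python integers rather than a numerically stabilized recurrence, so it is meant for small samples and continuous data (no ties); the name `ks_two_sample_exact_sf` and the rounding tolerance `1e-9` are assumptions made for this illustration.

```python
import math


def ks_two_sample_exact_sf(m, n, d):
    """P(T_{m,n} >= d) under the null, by counting corridor-restricted lattice paths."""
    # Work with integers: |i/m - j/n| < d  <=>  |i*n - j*m| < d*m*n.
    # The small tolerance (an assumption) absorbs rounding when d*m*n is an integer.
    h = math.ceil(d * m * n - 1e-9)
    row = [0] * (n + 1)  # row[j] = number of admissible paths reaching (i, j)
    for i in range(m + 1):
        for j in range(n + 1):
            if abs(i * n - j * m) >= h:
                row[j] = 0  # point lies outside the corridor
            elif i == 0 and j == 0:
                row[j] = 1
            else:
                from_left = row[j - 1] if j > 0 else 0   # step from (i, j - 1)
                from_below = row[j] if i > 0 else 0      # step from (i - 1, j)
                row[j] = from_left + from_below
    return 1.0 - row[n] / math.comb(m + n, m)
```

With $m = n = 2$ and $d = 1$, four of the six paths stay inside the corridor, giving a tail probability of $1/3$.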

3. Asymptotic Law and DKWM/Massart Inequality

A central result is the Kolmogorov limit theorem: under the null and as $n \to \infty$,

$\sqrt{n}\, D_n \xrightarrow{d} \sup_{t \in [0,1]} |B(t)|,$

where $B$ is a standard Brownian bridge. The limit CDF is

$K(x) = 1 - 2\sum_{k=1}^{\infty} (-1)^{k-1} e^{-2k^2x^2} = \frac{\sqrt{2\pi}}{x} \sum_{k=1}^{\infty} e^{-(2k-1)^2\pi^2/(8x^2)},$

where the first series converges rapidly for large $x$ and the second, obtained via the Jacobi theta identity, converges rapidly for small $x$ (Frommert et al., 2011, Mulbregt, 2018, Cui et al., 27 Feb 2025). For the two-sample case, under $H_0$ with $m, n \to \infty$,

$\sqrt{\frac{mn}{m+n}}\, T_{m,n} \xrightarrow{d} \sup_{t \in [0,1]} |B(t)|.$
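The Kolmogorov limit CDF can be evaluated numerically from its two standard series, the alternating form $1 - 2\sum_{k \geq 1} (-1)^{k-1} e^{-2k^2x^2}$ and the theta-function form $\frac{\sqrt{2\pi}}{x}\sum_{k \geq 1} e^{-(2k-1)^2\pi^2/(8x^2)}$. The sketch below switches between them at $x = 1$; that switch point and the fixed number of terms are assumptions chosen for simplicity, not tuned values.

```python
import math


def kolmogorov_cdf_alternating(x, terms=100):
    """Limit CDF via the alternating series (fast convergence for large x)."""
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * x * x)
            for k in range(1, terms + 1))
    return 1.0 - 2.0 * s


def kolmogorov_cdf_theta(x, terms=100):
    """Limit CDF via the Jacobi theta form (fast convergence for small x)."""
    s = sum(math.exp(-((2 * k - 1) ** 2) * math.pi ** 2 / (8.0 * x * x))
            for k in range(1, terms + 1))
    return math.sqrt(2.0 * math.pi) / x * s


def kolmogorov_cdf(x):
    """Limit CDF of sqrt(n) * D_n, choosing the better-conditioned series."""
    if x <= 0.0:
        return 0.0
    return kolmogorov_cdf_theta(x) if x < 1.0 else kolmogorov_cdf_alternating(x)
```

The two forms agree to many digits near the switch point, and the familiar 5% critical value of about 1.358 corresponds to a CDF value of about 0.95.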

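The Dvoretzky–Kiefer–Wolfowitz (DKW) inequality with Massart's optimal constant (DKWM) asserts that, when $F_0$ is the true CDF $F$, for the two-sided case and every $\varepsilon > 0$,

$P(D_n > \varepsilon) \leq 2e^{-2n\varepsilon^2},$

and for the one-sided case,

$P(D_n^+ > \varepsilon) \leq e^{-2n\varepsilon^2} \quad \text{for } \varepsilon \geq \sqrt{\ln 2 / (2n)},$

providing tight nonasymptotic control and facilitating the construction of simultaneous confidence bands for $F$ at prescribed significance levels (Greene et al., 2015, Cui et al., 27 Feb 2025).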
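A simultaneous confidence band follows by solving the two-sided bound $2e^{-2n\varepsilon^2} = \alpha$ for $\varepsilon$, which gives $\varepsilon = \sqrt{\ln(2/\alpha)/(2n)}$, and widening the empirical CDF by that amount. The following is a minimal sketch; the function name `dkw_band` and the tuple layout of its output are illustrative choices.

```python
import math


def dkw_band(sample, alpha=0.05):
    """Return (eps, band), where band lists (x, lower, upper) at each order statistic."""
    xs = sorted(sample)
    n = len(xs)
    # Half-width from the two-sided DKW bound with Massart's constant.
    eps = math.sqrt(math.log(2.0 / alpha) / (2.0 * n))
    band = []
    for i, x in enumerate(xs):
        f = (i + 1) / n  # empirical CDF value at x
        band.append((x, max(0.0, f - eps), min(1.0, f + eps)))
    return eps, band


eps_100, band_100 = dkw_band([k / 100.0 for k in range(100)], alpha=0.05)
```

The band is conservative at every finite $n$, since it rests on an inequality rather than on the exact law of $D_n$.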

4. Computational Algorithms and Numerical Stability

For empirical implementation, the KS statistic in one- and two-sample settings can be computed in linear time once the samples are sorted, by merging them and tracking the maximal gap between ECDFs: a two-pointer scan over the pooled array requires $O(m+n)$ time and $O(1)$ auxiliary memory (Sadhanala et al., 2019). For computing exact $p$-values, stable recurrences are necessary due to combinatorial explosion and risks of catastrophic cancellation; the C-recurrence for the two-sample test maintains all intermediate values in $[0,1]$ as convex combinations, avoiding overflow and ensuring robust double-precision answers for all $d$ and finite $m, n$ (Viehmann, 2021). In the one-sample setting, efficient switching between direct summation, closed-form formulae for small $n$, and asymptotic series is critical for accuracy and speed (Mulbregt, 2018).
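The two-pointer scan for the two-sample statistic can be sketched as follows. The code sorts its inputs for convenience, so the overall cost is dominated by sorting; the scan itself is a single pass with constant extra memory. The function name `ks_two_sample` is illustrative.

```python
def ks_two_sample(xs, ys):
    """Return sup_x |F_m(x) - G_n(x)| via a merge-style scan of the sorted samples."""
    xs, ys = sorted(xs), sorted(ys)
    m, n = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < m and j < n:
        v = min(xs[i], ys[j])
        # Advance past every copy of v in both samples so ties are handled together.
        while i < m and xs[i] == v:
            i += 1
        while j < n and ys[j] == v:
            j += 1
        d = max(d, abs(i / m - j / n))
    # Once one sample is exhausted, the gap can only shrink, so d is final.
    return d
```

For example, two fully separated samples give a statistic of 1, while identical samples give 0.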

For very large samples, asymptotic approximations using the Kolmogorov series are accurate, requiring only a handful of terms even at double-precision. Inverse survival/CDF functions and quantiles require root-finding, for which bracketed Newton–Raphson initialized by asymptotic or series inversion yields rapid, globally convergent evaluation (Mulbregt, 2018).
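Quantile evaluation can be illustrated on the asymptotic law. The sketch below inverts the Kolmogorov survival series with plain bisection, which is a simpler substitute for the bracketed Newton–Raphson scheme described above; the bracket $[0.2, 5]$, the tolerance, and the function names are assumptions made for this example.

```python
import math


def kolmogorov_sf(x, terms=100):
    """Asymptotic P(sqrt(n) * D_n > x) from the alternating Kolmogorov series."""
    if x <= 0.0:
        return 1.0
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * x * x)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))


def kolmogorov_isf(p, lo=0.2, hi=5.0, tol=1e-12):
    """Return x with kolmogorov_sf(x) ~= p, by bisection on a fixed bracket."""
    if not (kolmogorov_sf(hi) < p < kolmogorov_sf(lo)):
        raise ValueError("p is outside the range covered by the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kolmogorov_sf(mid) > p:
            lo = mid  # survival function is decreasing: root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Approximate 5% critical value of D_n for n = 1000 under the asymptotic law.
critical_d = kolmogorov_isf(0.05) / math.sqrt(1000)
```

Bisection converges linearly, so a Newton-type iteration would need fewer function evaluations; the point here is only the bracketing logic.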

5. Power, Sensitivity, and Discrete/Ordered Extensions

The KS statistic is maximally sensitive to differences in the center of the distribution (where CDFs are steepest) and is less powerful for alternatives differing only in the tails. Remedies such as the Anderson–Darling or variance-weighted KS statistics reweight tail discrepancies, but the standard KS test remains uniformly sensitive except in extremes (Sadhanala et al., 2019, Dowd, 2020). In discrete settings, the ordering of bins becomes critical: with naturally ordered categories, the discrete KS statistic outperforms permutation-invariant metrics like Euclidean distance, especially for sparse or low-count situations; for nominal data, the Euclidean distance is preferred due to the arbitrariness of bin order for $D_n$ (Carruth et al., 2012).
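The dependence on bin order can be seen in a small sketch that compares a discrete KS statistic, taken here as the largest absolute difference between cumulative observed and model proportions, with the Euclidean distance between the same proportions. Both definitions and the function names are assumptions for illustration and may differ in normalization from those used in the cited work.

```python
import math
from itertools import accumulate


def discrete_ks(observed_counts, model_probs):
    """Max absolute gap between cumulative observed proportions and model CDF."""
    n = sum(observed_counts)
    empirical = accumulate(c / n for c in observed_counts)
    model = accumulate(model_probs)
    return max(abs(e - q) for e, q in zip(empirical, model))


def euclidean_distance(observed_counts, model_probs):
    """Euclidean distance between observed proportions and model probabilities."""
    n = sum(observed_counts)
    return math.sqrt(sum((c / n - q) ** 2
                         for c, q in zip(observed_counts, model_probs)))


uniform = [0.25, 0.25, 0.25, 0.25]
ks_a = discrete_ks([3, 1, 0, 0], uniform)         # mass piled at the low end
ks_b = discrete_ks([0, 3, 1, 0], uniform)         # same counts, bins reordered
eu_a = euclidean_distance([3, 1, 0, 0], uniform)
eu_b = euclidean_distance([0, 3, 1, 0], uniform)
```

Reordering the bins changes the KS value but leaves the Euclidean distance unchanged, which is why the latter suits nominal categories.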

A normalized version of the statistic, such as $\sqrt{n}\, D_n$ or $\sqrt{mn/(m+n)}\, T_{m,n}$, enables sample-size-independent thresholding and provides a direct metric-like distance for visualization or clustering across samples of heterogeneous sizes (Fabbri et al., 2017).

6. Extensions, Bayesian and Simulation Contexts, and Modern Developments

The plug-in (modified) KS statistic, where one estimates parameters (e.g., maximum likelihood or Bayesian posterior mean) and assesses

$D_n(\hat\theta) = \sup_{x \in \mathbb{R}} |F_n(x) - F_{\hat\theta}(x)|,$

has asymptotically well-calibrated posterior predictive $p$-values under regularity conditions, converging to the uniform law under the null. This resolves the generic non-uniformity encountered with most other test statistics in Bayesian model checking and exhibits robust finite-sample behavior and broad sensitivity under alternatives (Shen, 18 Apr 2025).
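To make the plug-in statistic concrete, the sketch below fits an exponential model by maximum likelihood, evaluates $\sup_x |F_n(x) - F_{\hat\theta}(x)|$, and calibrates it with a parametric bootstrap that re-estimates the rate on every replicate. This is a frequentist illustration, not the posterior predictive procedure discussed above; the exponential family, the replicate count, and all function names are assumptions.

```python
import math
import random


def ks_stat(sample, cdf):
    """Two-sided one-sample KS statistic against a continuous CDF."""
    xs = sorted(sample)
    n = len(xs)
    return max(max((i + 1) / n - cdf(x), cdf(x) - i / n)
               for i, x in enumerate(xs))


def exp_cdf(rate):
    """Return the CDF of an exponential distribution with the given rate."""
    return lambda x: 1.0 - math.exp(-rate * x) if x > 0.0 else 0.0


def plugin_ks_exponential(sample, n_boot=500, seed=0):
    """Return (plug-in KS statistic, parametric-bootstrap p-value)."""
    rng = random.Random(seed)
    n = len(sample)
    rate_hat = n / sum(sample)  # maximum likelihood estimate of the rate
    d_obs = ks_stat(sample, exp_cdf(rate_hat))
    exceed = 0
    for _ in range(n_boot):
        boot = [rng.expovariate(rate_hat) for _ in range(n)]
        rate_b = n / sum(boot)  # re-estimate on each replicate
        if ks_stat(boot, exp_cdf(rate_b)) >= d_obs:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return d_obs, (exceed + 1) / (n_boot + 1)
```

Re-estimating the parameter inside the loop is what distinguishes this from comparing the plug-in statistic against the classical Kolmogorov law, which would be conservative.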

For inference under input uncertainty or simulation output (e.g., in quantifying Monte Carlo/model calibration impact on output distributions), the KS statistic generalizes to consider the supremum of a sum of a Brownian bridge and a mean-zero Gaussian process determined by the propagation of input model error. Subsampling and Monte Carlo approaches enable practical, coverage-guaranteed confidence bands for entire output CDFs, superseding classical bands under combined input/output sampling (Chen et al., 2024).

Recent algorithmic advances have produced quadratic and nearly linear time procedures for exact computation of $p$-values and thresholds for one-sided KS statistics, via FFT convolution and boundary-crossing reduction, enabling scaling to massive datasets without precision loss (Moscovich, 2020). Higher-order KS tests extend the IPM formulation to balls with higher total variation, recovering standard KS for $k = 0$ and achieving greater tail sensitivity for $k \geq 1$, albeit at increased computational cost in general (Sadhanala et al., 2019).

7. Practical Recommendations and Impact

The KS statistic remains a fundamental tool for nonparametric hypothesis testing, model assessment, and distributional comparison. For continuous data without parameter estimation, its properties (a distribution-free null law, fast computation, and well-understood finite- and large-sample regimes) make it the default choice. In small samples, use exact recursions; for moderate to large $n$, apply Kolmogorov asymptotics; for unknown parameters, apply parametric bootstrap or adjusted covariance. For discrete/ordinal data, KS exploits ordering, but the Euclidean norm offers stability for unordered categories. Modern extensions (variance weighting, higher-order norms, Bayesian calibration, and hybrid confidence bands) address sensitivity in tails, parameter uncertainty, and simulation scenarios. Practitioners are advised to accompany reported $D_n$ or $T_{m,n}$ with exact or conservative $p$-values, and/or normalized statistics (such as $\sqrt{n}\, D_n$), and, for greater robustness, to supplement with tests specifically targeted to known alternative structures or tail behaviors (Sadhanala et al., 2019, Fabbri et al., 2017, Cui et al., 27 Feb 2025, Chen et al., 2024).