Score-Based CUSUM Procedure
- Score-Based CUSUM Procedure is a statistical method that uses specialized score functions to detect abrupt changes in sequential data under weak assumptions.
- It adapts to diverse scenarios through signed-rank, Fisher/Hyvärinen, and functional scores, which suit nonparametric, unnormalized, and functional or high-dimensional settings, respectively.
- Practical calibration through threshold tuning and martingale-based false-alarm control bounds the false-alarm rate while keeping detection delay close to first-order optimal across various applications.
A score-based CUSUM (Cumulative Sum) procedure is a statistical methodology for detecting abrupt changes (change points) in the distributional characteristics of sequential data. Its cumulative sums are formed not from the classical log-likelihood ratios or raw increments but from "scores": functions or statistics of the data that are sensitive to changes yet can be constructed under weak distributional assumptions or in high-dimensional and unnormalized settings. This generalization encompasses nonparametric, robust, and functionally adaptive CUSUMs, as well as contemporary score-based CUSUMs used in high-dimensional machine learning, energy-based models, and functional data analysis.
1. Classical and General Score-Based CUSUM Principles
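The canonical CUSUM process is defined for sequential data $X_1, X_2, \dots$, assuming a possible change at an unknown time $\nu$. For densities $p_\infty$ (pre-change) and $q_1$ (post-change), the classical CUSUM update is
$$W_n = \max\Big\{0,\; W_{n-1} + \log\frac{q_1(X_n)}{p_\infty(X_n)}\Big\}, \qquad W_0 = 0,$$
with an alarm triggered at the first $n$ for which $W_n \ge h$ for some threshold $h$.
Score-based CUSUM procedures generalize the log-likelihood-ratio increment to a function $z(X_n)$ tailored to scenarios where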
- explicit likelihoods are intractable or undefined (e.g., unnormalized models),
- robust or nonparametric detection is required,
- data are functional or high-dimensional.
Variants in the literature include CUSUMs based on signed sequential ranks for symmetric distributions (Lombard et al., 2017), Hyvärinen/Fisher score differences for unnormalized models (Wu et al., 2023, Chen et al., 6 Nov 2025, Zhou et al., 22 Jan 2025), functional principal component scores (Torgovitski, 2014), and standard log-likelihood increments when parametric models are available (Baron et al., 13 Feb 2025).
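The recursion above can be sketched with a pluggable increment, so that the same loop serves the classical log-likelihood ratio and any score-based replacement. This is a minimal illustration rather than code from any of the cited papers; the names `cusum_stopping_time` and `gaussian_llr`, the Gaussian mean-shift example, and the threshold value are assumptions chosen for the demonstration.

```python
import numpy as np

def cusum_stopping_time(xs, increment, threshold):
    """Run W_n = max(0, W_{n-1} + increment(x_n)) and stop at the first W_n >= threshold.

    Returns (alarm, path): alarm is the 1-based alarm time or None,
    path holds the statistic up to and including the alarm time.
    """
    w = 0.0
    path = []
    alarm = None
    for n, x in enumerate(xs, start=1):
        w = max(0.0, w + increment(x))
        path.append(w)
        if w >= threshold:
            alarm = n
            break
    return alarm, np.array(path)

def gaussian_llr(mu0, mu1, sigma=1.0):
    """Classical increment: log-likelihood ratio for a mean shift mu0 -> mu1 (known sigma)."""
    return lambda x: (mu1 - mu0) * (x - 0.5 * (mu0 + mu1)) / sigma**2

# Illustrative data: mean shifts from 0 to 1 after 200 observations.
rng = np.random.default_rng(0)
xs = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(1.0, 1.0, 300)])
alarm, path = cusum_stopping_time(xs, gaussian_llr(0.0, 1.0), threshold=8.0)
print("alarm time:", alarm)
```

Swapping `gaussian_llr` for a rank-based or Hyvärinen-score increment leaves the loop unchanged, which is the sense in which the score-based procedures generalize the classical one.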
2. Score Functionals: Rank-Based, Fisher/Hyvärinen, and Functional Scores
The choice of score is central to the statistical properties and applicability of the CUSUM. Notable types:
2.1 Signed Sequential Rank Scores
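For continuous, symmetric distributions, scores based on signed sequential ranks allow detection of location and scale shifts in a distribution-free and self-starting manner (Lombard et al., 2017). At each time $i$, with deviations measured from the in-control median (taken as zero):
- Compute $s_i = \operatorname{sign}(X_i)$,
- Compute the rank $r_i^+$ of $|X_i|$ among the observed absolute deviations $|X_1|, \dots, |X_i|$,
- Choose an odd score function $J$ on $(-1, 1)$, for example,
  - Van der Waerden: $J(u) = \Phi^{-1}\big((1+u)/2\big)$,
  - Wilcoxon: $J(u) = u$,
- Define the normalized score $\xi_i = J\big(s_i r_i^+/(i+1)\big)/\nu_i$, where $\nu_i^2 = \frac{1}{i}\sum_{j=1}^{i} J\big(j/(i+1)\big)^2$, so that $\xi_i$ has zero mean and unit variance when the process is in control.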
These scores are robust to heavy tails and distribution-free when the process is in control.
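The steps above can be sketched as follows. This is a hedged illustration, not the reference implementation of Lombard et al.: the function name `signed_sequential_rank_scores`, the `median0` argument, and the unit-variance normalization (the root mean square of the score function over the grid $j/(i+1)$) are assumptions made for the example, and ties are ignored because the data are taken to be continuous.

```python
import numpy as np
from statistics import NormalDist

def signed_sequential_rank_scores(x, median0=0.0, kind="wilcoxon"):
    """Normalized signed sequential rank scores (illustrative sketch).

    x       : 1-D observations, assumed continuous and symmetric about median0 in control
    kind    : "wilcoxon" uses J(u) = u, "vdw" uses J(u) = Phi^{-1}((1 + u) / 2)
    """
    if kind not in ("wilcoxon", "vdw"):
        raise ValueError("kind must be 'wilcoxon' or 'vdw'")
    x = np.asarray(x, dtype=float) - median0
    abs_x = np.abs(x)
    inv_cdf = NormalDist().inv_cdf

    def J(u):
        return u if kind == "wilcoxon" else inv_cdf(0.5 * (1.0 + u))

    scores = np.empty(len(x))
    for i in range(1, len(x) + 1):
        # Sequential rank of |x_i| among |x_1|, ..., |x_i| (1-based).
        r = int(np.sum(abs_x[:i] <= abs_x[i - 1]))
        # In control the rank is uniform on 1..i, so this is the exact score scale.
        nu = np.sqrt(np.mean([J(j / (i + 1)) ** 2 for j in range(1, i + 1)]))
        scores[i - 1] = np.sign(x[i - 1]) * J(r / (i + 1)) / nu
    return scores

# The scores behave alike on light- and heavy-tailed symmetric data.
rng = np.random.default_rng(1)
normal_scores = signed_sequential_rank_scores(rng.standard_normal(400))
cauchy_scores = signed_sequential_rank_scores(rng.standard_cauchy(400))
print(normal_scores.var(), cauchy_scores.var())
```

Because only signs and ranks enter the computation, the in-control behaviour of the scores does not depend on which symmetric distribution generated the data. The loop is quadratic in the sample size, which is acceptable for a sketch but would be replaced by an incremental rank structure in a streaming setting.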
2.2 Fisher/Hyvärinen Score Differences
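For modern unnormalized models (energy-based, Markov random fields, score-based generative models), the score refers to the gradient of the log-density. The Hyvärinen score,
$$S_H(x, p) = \tfrac{1}{2}\,\|\nabla_x \log p(x)\|_2^2 + \Delta_x \log p(x),$$
depends only on the unnormalized model and is insensitive to normalizing constants (Wu et al., 2023, Chen et al., 6 Nov 2025, Zhou et al., 22 Jan 2025).
The CUSUM increment uses the difference $z_\lambda(x) = \lambda\,\big[S_H(x, p_\infty) - S_H(x, q_1)\big]$, where $p_\infty$ and $q_1$ are the pre- and post-change densities and the multiplier $\lambda > 0$ is tuned for false-alarm control such that $\mathbb{E}_\infty[\exp(z_\lambda(X_1))] \le 1$.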
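As a concrete illustration, the Hyvärinen score has a closed form for an isotropic Gaussian, where the gradient of the log-density is $-(x-\mu)/\sigma^2$ and its Laplacian is $-d/\sigma^2$. The sketch below uses that closed form to build the increment and run the recursion; the Gaussian family, the function names, and the choice of multiplier 1 are assumptions for the example, not part of the cited methods, which target models where the gradient and Laplacian come from automatic differentiation.

```python
import numpy as np

def hyvarinen_score_gaussian(x, mean, var):
    """S_H(x, p) = 0.5 * ||grad log p(x)||^2 + Laplacian log p(x) for p = N(mean, var * I).

    x is an array of n observations, shape (n,) for scalars or (n, d) for vectors.
    The normalizing constant of p never enters the computation.
    """
    x = np.asarray(x, dtype=float)
    x = x.reshape(len(x), -1)                 # (n, d)
    d = x.shape[1]
    grad = -(x - mean) / var                  # gradient of log p
    laplacian = -d / var                      # Laplacian of log p
    return 0.5 * np.sum(grad**2, axis=1) + laplacian

def scusum_increment(x, pre, post, lam):
    """lam * [S_H(x, pre) - S_H(x, post)], with pre/post given as (mean, var) pairs."""
    return lam * (hyvarinen_score_gaussian(x, *pre) - hyvarinen_score_gaussian(x, *post))

def scusum_path(increments):
    """CUSUM recursion W_n = max(0, W_{n-1} + z_n) over precomputed increments."""
    w, path = 0.0, []
    for z in increments:
        w = max(0.0, w + z)
        path.append(w)
    return np.array(path)

x = np.array([0.0, 1.0, 2.0])
z = scusum_increment(x, pre=(0.0, 1.0), post=(1.0, 1.0), lam=1.0)
print(z, scusum_path(z))
```

For a unit-variance mean shift the increment reduces to $\mu x - \mu^2/2$, which coincides with the log-likelihood ratio; in general the two differ, and the multiplier must be calibrated separately.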
2.3 Functional and Projection-Based Scores
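For functional data, principal component scores estimate the coordinates of each function in a suitable eigenbasis of the long-run covariance operator. The multivariate CUSUM is built from the cumulative sum of the leading component scores (Torgovitski, 2014):
$$T_n = \max_{1 \le k < n} w(k/n)\,\Big\| \frac{1}{\sqrt{n}} \sum_{i=1}^{k} \big(\hat\eta_i - \bar\eta_n\big) \Big\|_{\hat\Sigma^{-1}},$$
where $\hat\eta_i$ is the vector of leading scores of the $i$-th curve, $\bar\eta_n$ is their sample mean, $\hat\Sigma$ is the estimated long-run covariance of the scores, and $w(t) = [t(1-t)]^{-1/2}$ weights the partial sums to stabilize variance.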
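A projection-based statistic of this kind can be sketched for curves sampled on a common grid. The example below is a simplified illustration under a strong assumption: the curves are treated as independent, so the ordinary sample covariance stands in for the long-run covariance that the method actually requires. The function name `fpc_cusum_statistic`, the weight $[t(1-t)]^{-1/2}$, and the simulated data are likewise choices made for the example.

```python
import numpy as np

def fpc_cusum_statistic(curves, n_components=2):
    """Weighted CUSUM on leading principal component scores (illustrative sketch).

    curves : array (n, m), one discretized curve per row.
    Assumes independent curves: the sample covariance replaces the long-run covariance.
    Returns (statistic, k_hat), where k_hat estimates the number of pre-change curves.
    """
    X = np.asarray(curves, dtype=float)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)                        # center the curves
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ vt[:n_components].T              # (n, d) component scores, already centered
    eigvals = s[:n_components] ** 2 / n            # variance of each score coordinate
    partial = np.cumsum(scores, axis=0)[:-1]       # partial sums S_k for k = 1..n-1
    t = np.arange(1, n) / n
    quad = np.sum(partial**2 / eigvals, axis=1) / n
    weighted = quad / (t * (1.0 - t))              # squared, variance-stabilized CUSUM
    k_hat = int(np.argmax(weighted)) + 1
    return float(np.sqrt(weighted.max())), k_hat

# Simulated example: noise curves with a smooth mean shift after curve 120.
rng = np.random.default_rng(0)
n, m = 200, 50
grid = np.linspace(0.0, 1.0, m)
noise = rng.standard_normal((n, m))
shifted = noise.copy()
shifted[120:] += np.sin(np.pi * grid)
stat_null, _ = fpc_cusum_statistic(noise)
stat_change, k_hat = fpc_cusum_statistic(shifted)
print(stat_null, stat_change, k_hat)
```

For dependent curves the `eigvals` step would be replaced by an eigen-decomposition of a kernel-smoothed long-run covariance estimate, and the critical value would come from the limit theory or from simulation rather than from a comparison with a null sample.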
3. Statistical Properties and Asymptotic Optimality
Theoretical guarantees for score-based CUSUM procedures are available under mild conditions:
- Distribution-free In-control ARL: For signed-rank CUSUMs, the ARL in the absence of change depends only on the reference constant $k$ and the threshold $h$, independent of the underlying symmetric distribution (Lombard et al., 2017).
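- Martingale-based False Alarm Control: When using Hyvärinen scores, write $z_\lambda$ for the scaled score-difference increment with multiplier $\lambda$ chosen so that $\mathbb{E}_\infty[\exp(z_\lambda(X_1))] \le 1$. Then $\exp\big(\sum_{i \le n} z_\lambda(X_i)\big)$ is a nonnegative supermartingale under the pre-change law (a martingale under equality), which controls the average run length (ARL). If the threshold is $\tau$, then $\mathbb{E}_\infty[T] \ge e^{\tau}$; thus $\tau = \log\gamma$ guarantees ARL $\ge \gamma$ (Wu et al., 2023, Chen et al., 6 Nov 2025).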
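- Detection Delay and First-order Optimality: The expected detection delay under post-change is
$$\mathbb{E}_1[T] \sim \frac{\log\gamma}{\lambda\, D_F(q_1 \,\|\, p_\infty)} \qquad (\gamma \to \infty),$$
where $D_F(q_1 \,\|\, p_\infty) = \mathbb{E}_{q_1}\big[\tfrac{1}{2}\|\nabla_x \log q_1(X) - \nabla_x \log p_\infty(X)\|_2^2\big]$ is the Fisher divergence between pre- and post-change laws, $\gamma$ is the ARL constraint, and $\lambda$ is the multiplier in the score increment. This matches the CUSUM optimal rate up to the replacement of KL with Fisher divergence (Wu et al., 2023, Chen et al., 6 Nov 2025, Zhou et al., 22 Jan 2025).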
- Minimax Delay Metrics: Under the Lorden (worst-case) or Pollak (conditional) delay criteria, appropriately calibrated score-based CUSUMs attain the same logarithmic growth of delay in the ARL level as classical CUSUM, with the Fisher divergence taking the place of the KL divergence in the leading constant.
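The calibration behind the martingale bullet can be sketched numerically: given pre-change samples of the unscaled score difference, find the positive multiplier at which the empirical mean of the exponentiated increment equals one, then set the threshold to the logarithm of the target ARL. The bisection routine, its name `calibrate_lambda`, the search bracket, and the Gaussian test case are assumptions for this illustration; the cited papers may calibrate differently.

```python
import numpy as np

def calibrate_lambda(deltas, upper=10.0, iters=60):
    """Largest lam in (0, upper) with empirical mean of exp(lam * delta) <= 1.

    deltas : pre-change samples of S_H(x, pre) - S_H(x, post); their mean must be negative.
    The map lam -> mean(exp(lam * delta)) - 1 is convex, zero at lam = 0 and decreasing
    there, so it has a single positive root, which bisection locates.
    """
    deltas = np.asarray(deltas, dtype=float)
    if deltas.mean() >= 0:
        raise ValueError("pre-change mean of the score difference must be negative")

    def g(lam):
        return np.mean(np.exp(lam * deltas)) - 1.0

    lo, hi = 1e-6, upper
    if g(hi) <= 0:
        raise ValueError("increase `upper`: no sign change found in the bracket")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return lo

def threshold_for_arl(gamma):
    """Threshold tau = log(gamma), giving ARL >= gamma under the martingale bound."""
    return float(np.log(gamma))

# Unit-variance Gaussian mean shift 0 -> 1: delta(x) = x - 0.5, and the exact root is 1.
rng = np.random.default_rng(1)
deltas = rng.standard_normal(200_000) - 0.5
lam = calibrate_lambda(deltas)
tau = threshold_for_arl(1000.0)
print(lam, tau)
```

Returning the lower end of the bracket keeps the empirical constraint satisfied, at the cost of a slightly smaller multiplier and hence a marginally longer delay.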
For functional data, under weak dependence and using long-run principal components, the score-based CUSUM statistic converges (after scaling and centering) to a Gumbel law under the null by a functional version of the Darling–Erdős theorem (Torgovitski, 2014).
4. Variants and Applications in Modern Settings
The flexibility of score-based CUSUMs lends them to a variety of advanced scenarios:
4.1 Energy-Based and Unnormalized Models
SCUSUM and min-SCUSUM approaches are applicable to machine learning models where only unnormalized likelihoods can be evaluated. The computation reduces to evaluation of score and Laplacian terms, which can be efficiently handled via automatic differentiation (Wu et al., 2023, Chen et al., 6 Nov 2025).
4.2 Denoising Score Matching (DSM) CUSUM
The DSM-CUSUM (Zhou et al., 22 Jan 2025) estimates the score functions via denoising score matching using neural networks, both offline (pre- and post-change score nets) and online (continually adapting the post-change score net to recent windows of data). The Hyvärinen increment is then used in the CUSUM recursion. The method admits theoretical bounds on the detection delay in terms of estimation error and achieves false alarm control via Monte Carlo quantile calibration.
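The denoising score matching objective can be illustrated without a neural network. In the sketch below a linear score model $s(x) = ax + b$ is fitted by least squares against the denoising target $-(\tilde{x} - x)/\sigma^2$; this linear model is a deliberate stand-in for the score networks used in DSM-CUSUM, and the function names, noise scale, and one-dimensional Gaussian data are all assumptions for the example. Note that the fit targets the score of the noise-perturbed density, not of the clean one.

```python
import numpy as np

def fit_linear_score_dsm(x, sigma, rng):
    """Fit s(x) = a * x + b by denoising score matching (least squares, 1-D data).

    Minimizes the empirical mean of (s(x_noisy) + (x_noisy - x) / sigma**2) ** 2.
    """
    x = np.asarray(x, dtype=float)
    eps = rng.standard_normal(x.shape)
    x_noisy = x + sigma * eps
    target = -eps / sigma                       # equals -(x_noisy - x) / sigma**2
    design = np.column_stack([x_noisy, np.ones_like(x_noisy)])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return float(coef[0]), float(coef[1])

def hyvarinen_from_linear_score(x, a, b):
    """Hyvarinen score 0.5 * s(x)^2 + s'(x) for the fitted linear score."""
    s = a * np.asarray(x, dtype=float) + b
    return 0.5 * s**2 + a

rng = np.random.default_rng(0)
sigma = 0.5
pre_train = rng.normal(0.0, 1.0, 50_000)
post_train = rng.normal(2.0, 1.0, 50_000)
a0, b0 = fit_linear_score_dsm(pre_train, sigma, rng)
a1, b1 = fit_linear_score_dsm(post_train, sigma, rng)

def increment(x):
    """Estimated Hyvarinen score difference: pre-change model minus post-change model."""
    return hyvarinen_from_linear_score(x, a0, b0) - hyvarinen_from_linear_score(x, a1, b1)

drift_pre = increment(rng.normal(0.0, 1.0, 10_000)).mean()
drift_post = increment(rng.normal(2.0, 1.0, 10_000)).mean()
print(a1, b1, drift_pre, drift_post)
```

With unit-variance data and noise scale 0.5, the perturbed density has variance 1.25, so the fitted slope should be close to $-0.8$. The negative pre-change drift and positive post-change drift of the estimated increment are what the CUSUM recursion relies on.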
4.3 Multi-Stream and Fault Isolation
In the multi-stream setting, min-SCUSUM operates several parallel CUSUMs, one per stream, using the Hyvärinen increment associated with pre- and post-change hypotheses for each stream. The global detection time is the earliest time any stream's process exceeds its threshold, and the faulty stream is declared based on the maximal process at stopping (Chen et al., 6 Nov 2025). Theoretical bounds on false-identification probability and detection delay are obtained in terms of the number of streams and Fisher divergences.
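The stopping and isolation rule described above can be sketched as a vectorized recursion over precomputed per-stream increments. This is an illustration of the rule as stated in this section, not the authors' code; the function name `multistream_cusum` and the array layout are assumptions.

```python
import numpy as np

def multistream_cusum(increments, thresholds):
    """Parallel CUSUMs, one per stream (illustrative sketch).

    increments : array (n, K) of per-stream increments over time.
    thresholds : scalar or length-K array of per-stream thresholds.
    Returns (alarm_time, stream, w): the first 1-based time at which any stream's
    statistic reaches its threshold, the index of the stream with the largest
    statistic at that time, and the vector of statistics. (None, None, w) if no alarm.
    """
    z = np.asarray(increments, dtype=float)
    n, k = z.shape
    h = np.broadcast_to(np.asarray(thresholds, dtype=float), (k,))
    w = np.zeros(k)
    for t in range(n):
        w = np.maximum(0.0, w + z[t])       # update all K statistics at once
        if np.any(w >= h):
            return t + 1, int(np.argmax(w)), w
    return None, None, w

# Three streams; only stream 1 drifts upward.
z = np.tile(np.array([-1.0, 1.0, -1.0]), (5, 1))
alarm_time, stream, w = multistream_cusum(z, thresholds=3.0)
print(alarm_time, stream, w)
```

Each time step costs one vectorized update over the streams, so the overhead grows linearly with the number of streams.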
4.4 Functional Data and High Dimensions
For functional data (curves or high-dimensional signals), projection-based CUSUMs reduce the problem to multivariate detection on leading principal component scores, allowing effective mean-shift detection under complex dependence and slow convergence (Torgovitski, 2014).
5. Implementation Details and Practical Calibration
The design and deployment of score-based CUSUMs require careful attention to threshold calibration, parameter selection, and computational considerations:
- Threshold Selection: In SCUSUM-type procedures, thresholds are selected via explicit ARL control: $\tau = \log\gamma$ for desired ARL $\gamma$. In offline applications, Monte Carlo simulation can be used to estimate false-alarm levels.
- Reference Value Tuning: For signed-rank CUSUM, the reference constant is chosen as a function of a quantity determined by the score function and pre-change distribution (Lombard et al., 2017).
- Score Function Estimation: Neural network-based score function estimation via denoising score matching is efficient, requiring only moderate storage for small batches or windows. The number of forward and backward passes per time step can be kept small to lower computational cost (Zhou et al., 22 Jan 2025).
- Computational Complexity: For high-dimensional energy models, cost per time step is dominated by evaluation of the score and Laplacian terms, which grows with the data dimension but remains tractable because partition function estimation and MCMC are avoided (Wu et al., 2023).
- Multistream Complexity: For d parallel streams, overhead is O(d) per time step, allowing scalable detection across high-throughput systems (Chen et al., 6 Nov 2025).
- Functional CUSUM: Estimation of the long-run covariance/eigenbasis is performed with lag-windowed or kernel smoothers, followed by eigen-decomposition. Choice of the number of principal components is critical for power and is guided by explained variance or cross-validation.
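The Monte Carlo route to threshold selection mentioned in the list above can be sketched as a simulation of pre-change run lengths. The helper name `estimate_arl`, the censoring of long runs at `max_len`, and the Gaussian increment sampler are assumptions for this illustration.

```python
import numpy as np

def estimate_arl(sample_increments, threshold, n_runs=200, max_len=2000, rng=None):
    """Monte Carlo estimate of the in-control average run length.

    sample_increments(rng, n) must return n pre-change increments.
    Runs that do not alarm within max_len are censored at max_len, so the
    estimate is biased downward when the true ARL is comparable to max_len.
    """
    rng = np.random.default_rng() if rng is None else rng
    run_lengths = []
    for _ in range(n_runs):
        z = sample_increments(rng, max_len)
        w, run_length = 0.0, max_len
        for i, zi in enumerate(z, start=1):
            w = max(0.0, w + zi)
            if w >= threshold:
                run_length = i
                break
        run_lengths.append(run_length)
    return float(np.mean(run_lengths))

def gaussian_pre_change(rng, n):
    """Pre-change increments x - 0.5 with x ~ N(0, 1) (unit mean-shift example)."""
    return rng.standard_normal(n) - 0.5

rng = np.random.default_rng(0)
arl_low = estimate_arl(gaussian_pre_change, threshold=1.0, rng=rng)
arl_high = estimate_arl(gaussian_pre_change, threshold=3.0, rng=rng)
print(arl_low, arl_high, np.exp(3.0))
```

In this example the simulated ARL at threshold 3 sits well above the martingale lower bound $e^{3}$, which illustrates that the bound is conservative and that simulation can justify a smaller threshold when a pre-change sampler is available.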
6. Connections and Distinctions Relative to Classical Methods
Score-based CUSUM procedures generalize and extend the classical CUSUM in several respects:
| Aspect | Classical CUSUM | Signed-Rank/RPC | Fisher/Hyvärinen SCUSUM | DSM-CUSUM |
|---|---|---|---|---|
| Model assumption | Parametric | Symmetric, nonparametric | Unnormalized, nonlinear | Arbitrary, via estimated scores |
| Increment statistic | Log-likelihood ratio | Rank/score function | Hyvärinen score difference | Hyvärinen-score (learned) |
| In-control ARL calibration | Exact via likelihood | Distribution-free tables | Martingale, threshold | Simulation, quantile-based |
| Use case | Gaussian, exponential | Robust, heavy-tailed | Energy models, deep nets | High-dim, ML, out-of-model data |
This tabulation underscores that score-based CUSUMs retain optimality properties under generalized scenarios where the classical log-likelihood ratio is not available, not robust, or computationally feasible only in low dimensions.
7. Limitations, Performance Considerations, and Outlook
While score-based CUSUMs achieve asymptotic optimality and excellent empirical performance in diverse settings, several practical aspects merit attention:
- For signed-rank CUSUMs, the requirement of symmetry and continuity may preclude direct application to skewed or discrete data, though variant statistics for dispersion changes are discussed (Lombard et al., 2017).
- For Hyvärinen/score-based CUSUMs, the accuracy of estimated score functions (especially in DSM-CUSUM) directly impacts sensitivity and delay. Careful tuning of noise scale, window size, and batch updating strategies is essential (Zhou et al., 22 Jan 2025).
- In multi-stream scenarios, choice of thresholds affects both detection delay and fault isolation probability, but explicit formulas in terms of the number of streams and the Fisher divergences enable predictable risk control (Chen et al., 6 Nov 2025).
- For functional data, slow convergence to Gumbel limits requires Brownian-bridge approximation or Monte Carlo for accurate thresholding in moderate samples (Torgovitski, 2014).
- All procedures benefit from score functions matched to the anticipated direction or type of change; for instance, Wilcoxon scores for heavy tails and Van der Waerden scores for near-normal data (Lombard et al., 2017).
Score-based CUSUM methodologies are now integral to rapid anomaly detection in statistical signal processing, high-dimensional machine learning models, and streaming functional data analysis, providing distributional robustness, nonparametric validity, and computational tractability across a wide range of change-point detection problems.