
Min-SCUSUM: Sequential Change Detection

Updated 13 November 2025
  • Min-SCUSUM is a sequential detection algorithm that uses differences in Hyvärinen scores to identify abrupt changes in multistream data without needing normalization constants.
  • The method generalizes classical CUSUM by replacing log-likelihood increments with score differences, ensuring controlled false alarm rates and asymptotically optimal detection delays via Fisher divergence.
  • It is applicable to high-dimensional and energy-based models, offering robust performance even when traditional likelihood-based methods are infeasible.

The min-SCUSUM method is a sequential detection and diagnosis algorithm designed for multistream quickest change detection under settings where explicit likelihood ratios are infeasible or undesirable. By relying on the Hyvärinen score and Fisher divergence, min-SCUSUM generalizes the classical CUSUM/Min-CuSum framework to unnormalized statistical models, enabling effective and provably optimal performance in high-dimensional and energy-based contexts (Warner et al., 2023, Wu et al., 2023, Chen et al., 6 Nov 2025).

1. Theoretical Foundations

Min-SCUSUM extends the concept of online sequential analysis to the regime of multiple independent data streams, each of which may undergo an abrupt distributional change at an unknown time. The key innovation is the replacement of log-likelihood increments with differences of Hyvärinen scores:

  • Given $M$ parallel streams, the goal is to detect as quickly as possible when any stream transitions from a “pre-change” density $p$ to a stream-specific “post-change” density $q_i$, without requiring normalization constants.
  • The Hyvärinen score for a twice-differentiable density $r(x)$ on $\mathbb{R}^d$ is defined as $S_H(x; r) = \frac{1}{2}\|\nabla_x \log r(x)\|_2^2 + \Delta_x \log r(x)$, where $\Delta_x$ is the Laplacian.
  • The Fisher divergence, $D_F(p\|q) = \mathbb{E}_p[\frac{1}{2}\|\nabla \log p(X) - \nabla \log q(X)\|_2^2]$, quantifies the separation between distributions and governs the algorithm’s asymptotic detection delay.
  • For each stream $i$, the instantaneous increment at time $t$ is $d_t^{(i)} = S_H(X_{i,t}; p) - S_H(X_{i,t}; q_i)$, guaranteeing a negative drift under the null ($p$) and a positive drift after the change to $q_i$.

The Hyvärinen score is a strictly proper scoring rule and is unchanged when a density is multiplied by a constant, so the approach circumvents normalization by relying only on gradients and Laplacians of the (potentially unnormalized) log-densities.
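As a rough illustration of the drift property, the sketch below evaluates the Hyvärinen score in closed form for an isotropic Gaussian with unnormalized log-density $-\|x-\mu\|^2/(2\sigma^2)$ and checks the sign of the mean increment by simulation. Under standard regularity conditions the mean increment is $-D_F(p\|q_i)$ before the change and $D_F(q_i\|p)$ after it. The Gaussian choice, the helper names `hyvarinen_score_gaussian` and `increments`, and the sample sizes are assumptions made for this example and are not taken from the cited papers.

```python
import numpy as np

def hyvarinen_score_gaussian(x, mu, sigma2):
    """S_H(x; r) for log r(x) = -||x - mu||^2 / (2 * sigma2) + const.

    grad log r = -(x - mu) / sigma2 and Laplacian log r = -d / sigma2,
    so the normalizing constant never enters the score.
    """
    x = np.atleast_2d(x)
    d = x.shape[1]
    sq = np.sum((x - mu) ** 2, axis=1)
    return 0.5 * sq / sigma2**2 - d / sigma2

def increments(x, mu_p, mu_q, sigma2=1.0):
    """d_t = S_H(x; p) - S_H(x; q) for two Gaussians sharing one variance."""
    return (hyvarinen_score_gaussian(x, mu_p, sigma2)
            - hyvarinen_score_gaussian(x, mu_q, sigma2))

rng = np.random.default_rng(0)
mu_p, mu_q = np.zeros(2), np.array([1.0, 0.0])  # illustrative means (assumed)
fisher = 0.5 * np.sum((mu_q - mu_p) ** 2)  # Fisher divergence when sigma2 = 1

pre = increments(rng.normal(mu_p, 1.0, size=(20000, 2)), mu_p, mu_q)
post = increments(rng.normal(mu_q, 1.0, size=(20000, 2)), mu_p, mu_q)
print(pre.mean(), post.mean(), fisher)  # sample means should sit near -fisher and +fisher
```

In this unit-variance Gaussian case the score difference happens to coincide with the log-likelihood ratio, which makes it a convenient sanity check rather than a representative use case.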

2. Algorithm Definition and Workflow

The min-SCUSUM algorithm operates with $M$ parallel detection statistics, each recursively updating a CUSUM-like statistic:

  • For each stream $i=1, \dots, M$, initialize $W_0^{(i)}=0$.
  • At each time $t$, for stream $i$:

    1. Compute $d_t^{(i)} = S_H(X_{i,t}; p) - S_H(X_{i,t}; q_i)$.
    2. Update $W_t^{(i)} = \max \{ 0, W_{t-1}^{(i)} + d_t^{(i)} \}$.
  • Fix a threshold $b > 0$.

  • Define stopping times $T_i(b) = \inf\{ t \geq 1 : W_t^{(i)} \geq b \}$ for each stream and $T(b) = \min_{1 \leq i \leq M} T_i(b)$.
  • At time $T(b)$, declare a change and diagnose the altered stream via $D = \arg\max_{1 \le i \le M} W_{T(b)}^{(i)}$.

Min-SCUSUM Pseudocode

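import numpy as np

def min_scusum(stream, S_H_p, S_H_q, b):
    """stream yields X_t = (X_{1,t}, ..., X_{M,t}); S_H_p(x) = S_H(x; p); S_H_q[i](x) = S_H(x; q_i)."""
    M = len(S_H_q)
    W = np.zeros(M)  # W_0^{(i)} = 0 for every stream
    for t, X_t in enumerate(stream, start=1):
        for i in range(M):
            d = S_H_p(X_t[i]) - S_H_q[i](X_t[i])  # Hyvärinen score difference
            W[i] = max(0.0, W[i] + d)  # CUSUM-type recursion, reflected at 0
        if np.any(W >= b):
            T, D = t, int(np.argmax(W)) + 1  # streams labeled 1..M as in the text
            return T, D  # declare change at time T in stream D
    return None, None  # data exhausted without an alarm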

The procedure requires only gradients and Laplacians of the log-densities to evaluate $S_H$, and the $M$ statistics can be updated in parallel with vectorized code.
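One way the parallel update might look in NumPy is sketched below. It assumes the score differences have already been computed and stacked into an array `d` of shape `(T, M)`. The function name `min_scusum_from_increments` and the toy increments are hypothetical, and streams are indexed from 0 here.

```python
import numpy as np

def min_scusum_from_increments(d, b):
    """Run the recursion on precomputed increments d with shape (T, M).

    Returns (stopping time, diagnosed stream index, final statistics);
    the first two are None if the threshold b is never reached.
    """
    d = np.asarray(d, dtype=float)
    W = np.zeros(d.shape[1])  # one statistic per stream, all started at 0
    for t, d_t in enumerate(d, start=1):
        W = np.maximum(0.0, W + d_t)  # update all M statistics at once
        if np.any(W >= b):
            return t, int(np.argmax(W)), W  # 0-based stream index
    return None, None, W

# Toy increments for two streams over three time steps (assumed values).
d = np.array([[-1.0, -1.0],
              [0.5, 2.0],
              [0.5, 2.0]])
T, D, W = min_scusum_from_increments(d, b=3.0)
print(T, D, W)
```

Only the loop over time remains sequential, because each statistic depends on its previous value.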

3. Performance Guarantees

3.1 False Alarm Control

Under the no-change regime (all streams distributed as $p$), the mean time to the first false alarm satisfies

$$\mathbb{E}_\infty[T(b)] \geq \frac{e^b}{M}.$$

Thus, setting $b = \log(M/\alpha)$ ensures $\mathbb{E}_\infty[T] \geq 1/\alpha$ for any desired false-alarm rate $\alpha$ (Chen et al., 6 Nov 2025).
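The calibration rule is simple enough to sketch directly. The helper names `threshold_for_false_alarm` and `arl_lower_bound`, and the values $M=10$ and $\alpha=0.01$, are assumptions for illustration only.

```python
import math

def threshold_for_false_alarm(M, alpha):
    """Return b = log(M / alpha), the threshold suggested by the bound e^b / M."""
    if M < 1 or alpha <= 0 or M / alpha <= 1:
        raise ValueError("need M >= 1 and 0 < alpha < M so that b > 0")
    return math.log(M / alpha)

def arl_lower_bound(b, M):
    """Lower bound e^b / M on the mean time to the first false alarm."""
    return math.exp(b) / M

b = threshold_for_false_alarm(M=10, alpha=0.01)
print(b, arl_lower_bound(b, M=10))  # the bound equals 1 / alpha by construction
```

Because the threshold grows only logarithmically in $M$, monitoring more streams costs little in terms of $b$.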

3.2 Asymptotic Detection Delay

When a change occurs in stream $i$ at time $0$,

$$\mathbb{E}_i[T(b)] \sim \frac{b}{D_F(q_i \| p)}, \quad \text{as } b \to \infty$$

For $b = \log(M/\alpha)$, the worst-case delay (Lorden’s criterion)

$$\sup_\nu \mathrm{ess\,sup}\, \mathbb{E}_{\nu, i}[ T - \nu \mid T > \nu ] \sim \frac{ \log(M/\alpha) }{ D_F(q_i \| p) }$$

Similar results hold for the Kullback–Leibler-based Min-CuSum, replacing $D_F$ with $\mathrm{KL}(f_i\|f_0)$ (Warner et al., 2023, Wu et al., 2023).
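For a sense of scale, here is a worked first-order calculation. The values $M=10$, $\alpha=0.01$, and $D_F(q_i\|p)=0.5$ are assumptions chosen for the example and do not come from the cited papers.

```latex
b = \log\frac{M}{\alpha} = \log\frac{10}{0.01} = \log 1000 \approx 6.91,
\qquad
\mathbb{E}_i[T(b)] \approx \frac{b}{D_F(q_i\|p)} \approx \frac{6.91}{0.5} \approx 13.8 .
```

Since the relation is asymptotic in $b$, the delay at a finite threshold can differ from this figure by lower-order terms.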

3.3 Misidentification Probability

The probability of misdiagnosis (declaring $i$ instead of the true changed stream $j$) is exponentially controlled:

$$P_{\nu,j}(D=i \mid T>\nu) \leq e^{-b}(1+b)\Bigl(1+\frac{1}{D_F(q_j\|p)}+\zeta_{ij}(b)\Bigr)$$

where $\zeta_{ij}(b)\to 0$ as $b\to\infty$. With $b = \log(M/\alpha)$, the misidentification rate decays as $O((1+\log(M/\alpha))\alpha)$ (Chen et al., 6 Nov 2025).
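The stated rate follows by substituting the calibrated threshold into the bound. The step is spelled out below, using only $M \ge 1$ for the final inequality:

```latex
e^{-b}\Big|_{b=\log(M/\alpha)} = \frac{\alpha}{M},
\qquad
P_{\nu,j}(D=i \mid T>\nu)
\le \frac{\alpha}{M}\Bigl(1+\log\frac{M}{\alpha}\Bigr)\Bigl(1+\frac{1}{D_F(q_j\|p)}+\zeta_{ij}(b)\Bigr)
\le \alpha\Bigl(1+\log\frac{M}{\alpha}\Bigr)\Bigl(1+\frac{1}{D_F(q_j\|p)}+\zeta_{ij}(b)\Bigr).
```

The intermediate expression retains the factor $1/M$, so the $O((1+\log(M/\alpha))\alpha)$ form is the looser of the two.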

Summary Table

| Parameter | Min-SCUSUM Control | Asymptotic Behavior |
|---|---|---|
| False Alarm Rate | $b = \log(M/\alpha)$ | $\mathbb{E}_\infty[T]\geq 1/\alpha$ |
| Detection Delay | $\sim b/D_F(q_i \Vert p)$ | $\sim \log(M/\alpha)/D_F(q_i \Vert p)$ |
| Misidentification | $O((1+b)e^{-b})$ | $O((1+\log(M/\alpha))\alpha)$ |

4. Comparison to Likelihood-Based Approaches

Traditional Min-CuSum relies on explicit log-likelihood ratios, yielding

$$S_i(n) = \max_{1 \leq k \leq n} \sum_{t=k}^{n} \ell_i(t),\quad \ell_i(t) = \log \frac{f_i(X_t)}{f_0(X_t)}$$

However, such approaches are impractical for unnormalized densities or intractable partition functions. Min-SCUSUM instead requires only access to the unnormalized log-density and its derivatives, greatly expanding the class of models (e.g., energy-based models, RBMs, diffusion models) amenable to rigorous sequential change detection and diagnosis.
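To make the point about normalization concrete, the sketch below approximates the Hyvärinen score by central finite differences on an unnormalized log-density and checks that adding a constant (that is, changing the partition function) leaves the score unchanged. The function `hyvarinen_score_fd`, the quadratic `energy`, and the step size `h` are hypothetical choices for illustration. In practice the derivatives would usually come from analytic expressions or automatic differentiation.

```python
import numpy as np

def hyvarinen_score_fd(log_density, x, h=1e-3):
    """Approximate S_H(x; r) = 0.5 * ||grad log r||^2 + Laplacian log r
    using central differences on a (possibly unnormalized) log r."""
    x = np.asarray(x, dtype=float)
    f0 = log_density(x)
    grad_sq, laplacian = 0.0, 0.0
    for k in range(x.size):
        e = np.zeros_like(x)
        e[k] = h
        f_plus, f_minus = log_density(x + e), log_density(x - e)
        grad_sq += ((f_plus - f_minus) / (2 * h)) ** 2  # squared k-th partial derivative
        laplacian += (f_plus - 2 * f0 + f_minus) / h**2  # k-th second derivative
    return 0.5 * grad_sq + laplacian

def energy(x):
    # Hypothetical unnormalized log-density: a standard Gaussian without its constant.
    return -0.5 * float(np.dot(x, x))

def shifted(x):
    # Same model with a different (arbitrary) log-normalizer.
    return energy(x) + 123.4

x = np.array([0.3, -1.2, 2.0])
s1 = hyvarinen_score_fd(energy, x)
s2 = hyvarinen_score_fd(shifted, x)
print(s1, s2)  # both approximate 0.5 * ||x||^2 - 3
```

A log-likelihood ratio computed from `energy` and `shifted` would differ by the constant, whereas the score difference does not see it.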

Theoretically, both approaches admit first-order asymptotic optimality under Lorden’s delay criterion when calibrated to ensure the false alarm and misidentification constraints (Warner et al., 2023, Chen et al., 6 Nov 2025).
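The window-maximum form of the Min-CuSum statistic given earlier in this section and the recursive update of Section 2 agree up to truncation at zero, so they cross any threshold $b > 0$ at the same time. The sketch below checks this numerically on synthetic increments. The function names and the Gaussian increments are assumptions for the example.

```python
import numpy as np

def window_max_statistic(ell):
    """S(n) = max over 1 <= k <= n of sum_{t=k}^{n} ell[t], evaluated directly."""
    ell = np.asarray(ell, dtype=float)
    return np.array([
        max(ell[k:n + 1].sum() for k in range(n + 1))
        for n in range(len(ell))
    ])

def recursive_statistic(ell):
    """W_n = max(0, W_{n-1} + ell_n) with W_0 = 0."""
    W, out = 0.0, []
    for x in ell:
        W = max(0.0, W + x)
        out.append(W)
    return np.array(out)

rng = np.random.default_rng(1)
ell = rng.normal(-0.2, 1.0, size=200)  # synthetic increments with negative drift
S = window_max_statistic(ell)
W = recursive_statistic(ell)
print(np.allclose(np.maximum(S, 0.0), W))
```

The recursion needs constant memory per stream, which is why it is the form used in the sequential procedure.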

5. Practical Estimation and Implementation Considerations

Estimating the required score functions $(\nabla \log p, \nabla \log q_i)$ can be accomplished via:

  • Score-matching (Hyvärinen, 2005): Empirically minimize Fisher divergence to fit parametric models for $\nabla \log q_\theta(x)$ using observed samples.
  • Deep score network approaches: Denoising score matching across multiple noise levels (Song & Ermon, NeurIPS'19) enables accurate approximation of score functions in high dimensions.
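A minimal score-matching sketch follows, assuming an isotropic Gaussian family $q_\theta = N(\mu, I/\tau)$, for which the empirical objective (the sample mean of $S_H(x; q_\theta)$) has a closed-form minimizer. The family, the function names, and the synthetic data are assumptions for illustration. Neural score models would instead minimize the same kind of objective by gradient descent.

```python
import numpy as np

def fit_isotropic_gaussian_by_score_matching(X):
    """Minimize the sample mean of S_H(x; q_theta) for q_theta = N(mu, I / tau).

    Here S_H = 0.5 * tau^2 * ||x - mu||^2 - d * tau, so setting the derivatives
    to zero gives mu = sample mean and tau = d / mean(||x - mu||^2).
    """
    X = np.asarray(X, dtype=float)
    d = X.shape[1]
    mu = X.mean(axis=0)
    tau = d / np.mean(np.sum((X - mu) ** 2, axis=1))
    return mu, tau

def empirical_objective(X, mu, tau):
    """Sample mean of the Hyvärinen score under the candidate parameters."""
    d = X.shape[1]
    return np.mean(0.5 * tau**2 * np.sum((X - mu) ** 2, axis=1) - d * tau)

rng = np.random.default_rng(2)
X = rng.normal([1.0, -2.0], 2.0, size=(50000, 2))  # synthetic data: mu = (1, -2), tau = 1/4
mu_hat, tau_hat = fit_isotropic_gaussian_by_score_matching(X)
print(mu_hat, tau_hat)
```

No normalizing constant appears anywhere in the objective, which is the property that carries over to energy-based models.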

Once estimators for the required gradients and Laplacians are obtained, the running cost per sample is $O(dM)$, where $d$ is the data dimensionality.

Calibration of the threshold $b$ is governed by the desired false-alarm rate and the number of streams, and all performance guarantees hold uniformly in the change-point.

6. Empirical and Application Evidence

Simulation studies on multidimensional Gaussian models and energy-based models including Gauss–Bernoulli RBMs corroborate theory:

  • The detection delay tracks the $b/D_F$ law, and the misidentification rate stays below the theoretical upper bound.
  • Setting the threshold by the analytic formula $b = \log(M/\alpha)$ yields empirical false-alarm rates and delays in line with the predictions.
  • In real-world applications such as video anomaly detection across multiple camera streams, min-SCUSUM identifies altered streams with high reliability even when neither pre- nor post-change densities are normalized (Chen et al., 6 Nov 2025).
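A small Monte Carlo experiment of this kind can be sketched as follows. It is not a reproduction of the cited experiments: the one-dimensional unit-variance Gaussian streams, the shared post-change mean `mu`, the values of `M` and `alpha`, and the number of runs are all assumptions chosen to keep the example short. Streams are indexed from 0.

```python
import numpy as np

def simulate_once(rng, M, changed, mu, b, max_steps=10_000):
    """One run with the change present in stream `changed` from t = 1 (nu = 0)."""
    W = np.zeros(M)
    means = np.zeros(M)
    means[changed] = mu  # only the changed stream has mean mu
    for t in range(1, max_steps + 1):
        x = rng.normal(means, 1.0)
        # S_H(x; p) - S_H(x; q) for p = N(0, 1) and q = N(mu, 1)
        d = mu * x - 0.5 * mu**2
        W = np.maximum(0.0, W + d)
        if np.any(W >= b):
            return t, int(np.argmax(W))
    return max_steps, int(np.argmax(W))

rng = np.random.default_rng(3)
M, mu, alpha = 3, 1.0, 0.02
b = np.log(M / alpha)
fisher = 0.5 * mu**2  # D_F(q||p) for this Gaussian pair

runs = [simulate_once(rng, M, changed=0, mu=mu, b=b) for _ in range(2000)]
mean_delay = np.mean([t for t, _ in runs])
accuracy = np.mean([D == 0 for _, D in runs])
print(mean_delay, b / fisher, accuracy)
```

At a moderate threshold the empirical delay is expected to sit somewhat above $b/D_F$, since the first-order law ignores overshoot and other lower-order terms.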

A plausible implication is that min-SCUSUM enables rigorous sequential change diagnosis in modern high-dimensional settings where the likelihood-based CuSum approaches are computationally infeasible or ill-defined.

7. Connections, Limitations, and Extensions

Min-SCUSUM is directly linked to the general class of proper scoring rules, with the Hyvärinen score being a special choice that confers tractability for unnormalized models.

The method relies on the assumption that score functions are differentiable and Laplacians well-defined, which may restrict its application in discrete or degenerate models. For misspecified models, or when only approximate score estimators are available, the exponential tail bounds on misidentification error and the minimality of delay require empirical verification.

Future directions include adaptation to asynchronous changes, nonparametric score estimation, and rigorous finite-sample performance guarantees in highly misspecified settings. The framework remains extensible to more complex structural change regimes as long as the score differences retain negative and positive drift properties pre- and post-change (Chen et al., 6 Nov 2025).
