
Deep Change Point Detectors

Updated 30 June 2025
  • Deep change point detectors are algorithms that leverage neural architectures and kernel methods to identify abrupt distributional shifts in complex stochastic processes.
  • They extend traditional methods like CUSUM to accommodate high-dimensional, non-i.i.d., and nonstationary data, including jump processes in Lévy models.
  • These detectors offer robust, online change detection with minimax optimality, finding applications in finance, insurance, industrial monitoring, and cybersecurity.

Deep change point detectors are algorithms and statistical methods that leverage the representational power of deep learning or modern nonparametric techniques to detect abrupt changes in the distributional properties of stochastic processes and time series. These methods extend classical paradigms—such as likelihood-ratio tests and CUSUM procedures—to accommodate non-i.i.d., high-dimensional, or nonstationary data, and often incorporate neural architectures, kernel methods, or latent-space learning to capture complex generative mechanisms, including those with jumps, structural dependencies, or intricate feature interactions.

1. CUSUM Procedures for Lévy Processes

The CUSUM (Cumulative Sum) procedure is foundational in sequential change-point detection, monitoring cumulative (log-)likelihood ratios to register an abrupt change in the statistical law of an observed process. The approach is well-established for discrete-time, i.i.d. processes but is extended to continuous-time Lévy processes—which encompass both continuous-path motions (e.g., Brownian motion) and jump processes (e.g., Poisson process)—by directly considering the likelihood structure under pre- and post-change regimes.

For a Lévy process $X_t$ with change at $T$ from law $P^0$ to $P^1$, define the likelihood ratio process as

$$L_t = \frac{dP^1|_{\mathcal{F}_t}}{dP^0|_{\mathcal{F}_t}},$$

where $\mathcal{F}_t$ is the natural filtration. The CUSUM statistic in log-form is

$$Y_t = U_t - \inf_{0 \leq s \leq t} U_s,$$

where $U_t = \log L_t$. The stopping rule is

$$T_h = \inf\{t \geq 0: Y_t \geq \log h\},$$

with threshold $h$ tuned to balance detection delay and false alarms.
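The log-form statistic and its stopping rule can be sketched on a sampled path of the log-likelihood ratio. This is a minimal illustration, assuming the path is available on a finite grid with $U_0 = 0$; the function name `cusum_log_form` and the toy values are hypothetical and not taken from the source.

```python
import numpy as np

def cusum_log_form(U, log_h):
    """Return (Y, alarm_index) for a sampled log-likelihood-ratio path U.

    Assumes U[0] is U_0 = 0, i.e. no data observed yet.
    """
    U = np.asarray(U, dtype=float)
    running_min = np.minimum.accumulate(U)   # inf of U_s over the grid points up to t
    Y = U - running_min                      # reflected statistic, always >= 0
    hits = np.flatnonzero(Y >= log_h)        # grid points where the threshold is reached
    alarm_index = int(hits[0]) if hits.size else None
    return Y, alarm_index

# Toy path: the log-ratio drifts down, then climbs after a change.
U = [0.0, -0.5, -1.0, -0.2, 0.9, 2.1]
Y, alarm = cusum_log_form(U, log_h=2.0)
```

Subtracting the running minimum means the statistic measures how far the log-likelihood ratio has risen from its lowest point so far, which is why evidence accumulated before the change does not delay the alarm.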

When direct continuous-time monitoring is impractical—as with nearly all applications—CUSUM is implemented via discretization, with recursive update

$$S_k = \max(S_{k-1}, 1) \cdot L_k,$$

where $L_k$ is the likelihood ratio for the $k$-th increment. The CUSUM method supports efficient, online implementation and is robust to high-frequency discretization.
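As a concrete instance of the recursion, consider a Poisson process whose jump rate changes from `lam0` to `lam1`, observed as jump counts per window of length `delta`. The Poisson setting, the starting value of 1 for the statistic, and all names below are assumptions chosen for illustration; the source does not prescribe them.

```python
import math

def poisson_increment_lr(n_k, lam0, lam1, delta):
    """Likelihood ratio (rate lam1 vs lam0) of n_k jumps in a window of length delta."""
    return (lam1 / lam0) ** n_k * math.exp(-(lam1 - lam0) * delta)

def discrete_cusum(counts, lam0, lam1, delta, h):
    """Run S_k = max(S_{k-1}, 1) * L_k; return (path of S_k, first alarm index or None)."""
    S, path, alarm = 1.0, [], None           # assumed start: S_0 = 1
    for k, n_k in enumerate(counts, start=1):
        S = max(S, 1.0) * poisson_increment_lr(n_k, lam0, lam1, delta)
        path.append(S)
        if alarm is None and S >= h:         # record only the first crossing
            alarm = k
    return path, alarm

# Hypothetical jump counts: quiet at first, then a burst.
counts = [1, 0, 1, 3, 4, 2]
path, alarm = discrete_cusum(counts, lam0=1.0, lam1=3.0, delta=1.0, h=20.0)
```

Resetting to 1 before each multiplication is what keeps the statistic from being dragged down by a long pre-change history.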

2. Fundamental Properties of Lévy Processes and Detection Relevance

Lévy processes are stochastic processes with stationary, independent increments and right-continuous sample paths with left limits, which lets them model sudden jumps. Typical applications arise in finance (returns with jumps or stochastic volatility), insurance (claim arrival or size processes), and, more generally, risk management for systems prone to abrupt, rare events.

The chief challenge in change-point detection in Lévy processes is the jump behavior and ensuing non-Gaussianity, which can confound techniques relying on moment assumptions or continuity. Proper likelihood-based detection thus requires careful construction of likelihood ratios or their surrogates, fully exploiting the independent and stationary increment properties.
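Two standard cases illustrate what such a likelihood ratio looks like. They are textbook examples added here for orientation, not formulas quoted from the source. For Brownian motion with drift changing from $\mu_0$ to $\mu_1$ at common volatility $\sigma$ (with $X_0 = 0$), and for a Poisson process $N_t$ with rate changing from $\lambda_0$ to $\lambda_1$, the log-likelihood ratio $U_t = \log L_t$ is

$$U_t = \frac{\mu_1 - \mu_0}{\sigma^2}\left(X_t - \frac{\mu_0 + \mu_1}{2}\,t\right) \quad \text{(Brownian motion with drift)},$$

$$U_t = N_t \log\frac{\lambda_1}{\lambda_0} - (\lambda_1 - \lambda_0)\,t \quad \text{(Poisson process)}.$$

In both cases $U_t$ depends on the path only through its current value and elapsed time, so it inherits stationary, independent increments. In the Poisson case each jump contributes a fixed amount $\log(\lambda_1/\lambda_0)$, which shows how jump behavior enters the statistic directly rather than through moment assumptions.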

3. Minimax Optimality in Lorden's Sense

Lorden's minimax criterion is the standard benchmark for real-time change detection: it minimizes the worst-case expected detection delay, taken over all change-point times and all pre-change sample paths, subject to a lower bound on the mean time to false alarm under the no-change regime. Writing $\tau$ for a candidate stopping rule, $T$ for the change point, $E_T$ for expectation when the change occurs at $T$, and $E_\infty$ for expectation when no change occurs, the worst-case delay is

$$d^c(\tau) := \sup_{T \geq 0} \operatorname{ess\,sup} E_T[(\tau - T)^+ \mid \mathcal{F}_T],$$

and the optimization problem is

$$\inf_{\tau \in \mathcal{T}_y} d^c(\tau), \quad \mathcal{T}_y = \{\tau : E_\infty[\tau] \geq y\}.$$

For Lévy processes, the main result shows the CUSUM stopping rule $T_h$ is minimax optimal in this sense, i.e., it minimizes the maximal delay under a fixed false alarm constraint. Optimality is established via:

  • Approximation with discrete-time problems (equispaced grid), showing discrete CUSUM is optimal (Proposition 3.6),
  • Passing to the continuous-time limit,
  • Martingale and optimal stopping arguments (Snell envelopes),
  • Equalizer property: CUSUM achieves constant worst-case delay for all change-point times.

This establishes CUSUM's unique suitability for practical, robust operation in the continuous-time context of Lévy processes.
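The trade-off between detection delay and false alarms can be explored numerically. The sketch below is a Monte Carlo illustration in a simple discrete-time setting, assuming i.i.d. observations that shift from N(0, 1) to N(1, 1); this Gaussian setting, the threshold, and the function names are illustrative assumptions rather than the source's Lévy-process setup. The delay is measured with the change at time 0 and the statistic starting at 0, the configuration usually taken as the worst case for CUSUM.

```python
import numpy as np

def cusum_alarm_time(llr_increments, log_h):
    """First k with Y_k >= log_h, where Y_k = max(Y_{k-1} + l_k, 0); None if no alarm."""
    y = 0.0
    for k, l_k in enumerate(llr_increments, start=1):
        y = max(y + l_k, 0.0)
        if y >= log_h:
            return k
    return None

def mean_alarm_time(mu_true, log_h, n_runs=200, horizon=2000, seed=0):
    """Monte Carlo mean alarm time when the data actually have mean mu_true."""
    rng = np.random.default_rng(seed)
    times = []
    for _ in range(n_runs):
        x = rng.normal(mu_true, 1.0, size=horizon)
        llr = x - 0.5                               # log-likelihood ratio of N(1,1) vs N(0,1)
        t = cusum_alarm_time(llr, log_h)
        times.append(horizon if t is None else t)   # censor runs with no alarm
    return float(np.mean(times))

false_alarm_time = mean_alarm_time(mu_true=0.0, log_h=3.0)  # no change ever occurs
delay_from_zero = mean_alarm_time(mu_true=1.0, log_h=3.0)   # change at time 0
```

Raising `log_h` lengthens both quantities; the criterion fixes a floor on the first and asks for the smallest achievable value of the second.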

4. Discretization, Approximation, and Practical Implementation

The theoretical optimality of CUSUM is both proved and enabled by discretization:

  • Change points are restricted to a discrete grid ($T \in \Delta \mathbb{Z}_+$).
  • The CUSUM statistic updates recursively using only local likelihood increments.
  • As the grid is refined ($\Delta \to 0$), discrete procedures approximate the continuous case with vanishing error.

Because real data are always sampled at finite resolution, CUSUM applies directly to high-frequency data, and its performance is stable across a range of sampling intervals. The grid formulation also suits settings where changes can occur only at specific times (e.g., daily or hourly) or where grid-based modeling is otherwise more natural.
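The effect of grid refinement can be checked on a small example. The sketch below bins one fixed Poisson-type path at two resolutions and runs the same recursion on each; the deterministic jump times, the rates, the threshold, and the function names are hypothetical choices made so the example is reproducible.

```python
import numpy as np

def binned_log_lr(jump_times, t_end, delta, lam0, lam1):
    """Per-window log-likelihood ratios for a jump path sampled on a grid of width delta."""
    n_bins = int(round(t_end / delta))
    counts, _ = np.histogram(jump_times, bins=n_bins, range=(0.0, t_end))
    return counts * np.log(lam1 / lam0) - (lam1 - lam0) * delta

def cusum_alarm(llr, log_h, delta):
    """Calendar time of the first grid point with Y_k >= log_h, or None."""
    y = 0.0
    for k, l_k in enumerate(llr, start=1):
        y = max(y + l_k, 0.0)
        if y >= log_h:
            return k * delta
    return None

# Toy path: one jump per unit time on [0, 20), three per unit time on [20, 40).
pre = np.arange(20) + 0.37
post = 20.0 + (np.arange(60) + 0.37) / 3.0
jumps = np.concatenate([pre, post])

coarse = binned_log_lr(jumps, 40.0, 1.0, lam0=1.0, lam1=3.0)
fine = binned_log_lr(jumps, 40.0, 0.1, lam0=1.0, lam1=3.0)
coarse_alarm = cusum_alarm(coarse, log_h=4.2, delta=1.0)
fine_alarm = cusum_alarm(fine, log_h=4.2, delta=0.1)
```

Both grids accumulate the same total log-likelihood ratio, and the finer grid can only raise the alarm at or before the coarser one on the same path, because its running minimum is taken over a superset of time points.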

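Recursive implementation,

    S_k = max(S_prev, 1.0) * L_k   # L_k: likelihood ratio of the k-th increment
    if S_k >= h:                   # h: alarm threshold
        alarm()
    S_prev = S_k                   # carry the statistic to the next step

supports streaming operation with minimal computational overhead.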

5. Comparison with Alternative Change-Point Detection Methods

Bayesian Methods (e.g., Shiryaev–Roberts test) introduce priors for the change-point and focus on the expected delay rather than the worst-case. While sometimes optimal under different risk criteria, they may be computationally intensive and require knowledge of prior distributions.

Nonparametric Methods such as empirical tail integrals or likelihood ratios offer robustness to model misspecification but typically forfeit power (delay optimality) when pre/post-change laws are known. They may be necessary when model parameters are unknown or when data is not amenable to parametric modeling.

Exponentially Weighted Schemes and Shewhart rules may be useful for some objectives (nonlinear penalties, large deviations) but are suboptimal under Lorden’s linear penalty criterion.

Advantages of CUSUM: minimax delay optimality, equalizer property, recursive updates, and robust performance on discretely and continuously sampled observations. Limitations: requirement for explicit knowledge of pre- and post-change distributions, and suboptimality under penalties other than Lorden’s.

6. Applications and Research Directions

CUSUM for Lévy processes is applied to:

  • Finance: Regime switches and risk detection in asset prices,
  • Insurance: Monitoring for abrupt changes in claim rates,
  • Industrial monitoring: Fault and anomaly detection in jump-driven signals,
  • Environmental monitoring: Earthquake or extreme event detection,
  • Cybersecurity: Detecting impulsive anomalies in network traffic.

Promising research avenues include:

  • Extending CUSUM's optimality to nonlinear delay penalties (exponential, etc.),
  • Fully nonparametric and robust extensions that relax requirements on the Lévy triplet,
  • Generalization to models with dependent increments (e.g., Hawkes processes),
  • Integration with deep learning for representation learning in multivariate or nonstationary data,
  • Deployment in high-frequency, high-dimensional, or streaming settings,
  • Online adaptation to unknown or time-varying pre/post-change distributions.

| Aspect | CUSUM for Lévy Processes |
| --- | --- |
| Change-point statistic | $S_t = \sup_{0 \leq s \leq t} L_t / L_s$, or $Y_t = U_t - \inf_{0 \leq s \leq t} U_s$ |
| Stopping rule | $T_h = \inf\{t \geq 0: S_t \geq h\}$ |
| Optimality | Minimax (Lorden): worst-case delay minimized given false alarm constraint |
| Methodological basis | Discrete approximation, Markovian optimal stopping, Snell envelope |
| Applications | Finance, insurance, engineering, cyber-physical systems, biology |
| Comparisons | Outperforms threshold/Bayesian/ad hoc in minimax sense; requires model params |
| Future directions | Robustness, nonlinear penalties, deep learning, online/high-frequency contexts |

The rigorous development and analysis presented establish the CUSUM procedure as the minimax optimal change-point detector for Lévy processes in continuous time, underscore its direct implementability via discretization, and delineate avenues for extending such methods to more complex or data-rich environments, including those leveraging modern deep learning representations and adaptive strategies.