Deep Change Point Detectors
- Deep change point detectors are algorithms that leverage neural architectures and kernel methods to identify abrupt distributional shifts in complex stochastic processes.
- They extend traditional methods like CUSUM to accommodate high-dimensional, non-i.i.d., and nonstationary data, including jump processes in Lévy models.
- These detectors offer robust, online change detection with minimax optimality guarantees, finding applications in finance, insurance, industrial monitoring, and cybersecurity.
Deep change point detectors are algorithms and statistical methods that leverage the representational power of deep learning or modern nonparametric techniques to detect abrupt changes in the distributional properties of stochastic processes and time series. These methods extend classical paradigms—such as likelihood-ratio tests and CUSUM procedures—to accommodate non-i.i.d., high-dimensional, or nonstationary data, and often incorporate neural architectures, kernel methods, or latent-space learning to capture complex generative mechanisms, including those with jumps, structural dependencies, or intricate feature interactions.
1. CUSUM Procedures for Lévy Processes
The CUSUM (Cumulative Sum) procedure is foundational in sequential change-point detection, monitoring cumulative (log-)likelihood ratios to register an abrupt change in the statistical law of an observed process. The approach is well-established for discrete-time, i.i.d. processes but is extended to continuous-time Lévy processes—which encompass both continuous-path motions (e.g., Brownian motion) and jump processes (e.g., Poisson process)—by directly considering the likelihood structure under pre- and post-change regimes.
For a Lévy process $X = (X_t)_{t \ge 0}$ with a change at time $\theta$ from law $\mathbb{P}_0$ to law $\mathbb{P}_1$, define the likelihood ratio process as
$$L_t = \frac{d\mathbb{P}_1}{d\mathbb{P}_0}\bigg|_{\mathcal{F}_t}, \qquad t \ge 0,$$
where $(\mathcal{F}_t)_{t \ge 0}$ is the natural filtration of $X$. The CUSUM statistic in log-form is
$$U_t = u_t - \inf_{0 \le s \le t} u_s,$$
where $u_t = \log L_t$. The stopping rule is
$$\tau_h = \inf\{t \ge 0 : U_t \ge h\},$$
with threshold $h > 0$ tuned to balance detection delay and false alarms.
When direct continuous-time monitoring is impractical (as in nearly all applications), CUSUM is implemented via discretization, with recursive update
$$S_k = \max(S_{k-1}, 1)\,\ell_k, \qquad S_0 = 1,$$
where $\ell_k$ is the likelihood ratio for the $k$-th increment, and an alarm is raised once $S_k \ge e^h$ (equivalently, once $\log S_k \ge h$). The CUSUM method supports efficient, online implementation and is robust to high-frequency discretization.
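For concreteness, the increment likelihood ratio $\ell_k$ has a simple closed form in two canonical Lévy models; the formulas below are standard illustrations rather than prescriptions from the source, with $\Delta$ the sampling interval, $\Delta X_k$ the $k$-th increment, and $N_k$ the number of jumps observed in the $k$-th interval:
$$\log \ell_k = \frac{\mu_1 - \mu_0}{\sigma^2}\,\Delta X_k - \frac{\mu_1^2 - \mu_0^2}{2\sigma^2}\,\Delta \quad \text{(Brownian motion, drift change } \mu_0 \to \mu_1 \text{, known volatility } \sigma\text{)},$$
$$\log \ell_k = N_k \log\frac{\lambda_1}{\lambda_0} - (\lambda_1 - \lambda_0)\,\Delta \quad \text{(Poisson process, intensity change } \lambda_0 \to \lambda_1\text{)}.$$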
2. Fundamental Properties of Lévy Processes and Detection Relevance
Lévy processes are right-continuous, stationary-increment, and independent-increment stochastic processes capable of modeling sudden jumps. Typical applications arise in finance (to model returns with jumps or stochastic volatility), insurance (claims arrival or size processes), and generally in risk management for systems prone to abrupt, rare events.
The chief challenge in change-point detection in Lévy processes is the jump behavior and ensuing non-Gaussianity, which can confound techniques relying on moment assumptions or continuity. Proper likelihood-based detection thus requires careful construction of likelihood ratios or their surrogates, fully exploiting the independent and stationary increment properties.
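As a brief illustration of this point (assuming the pre- and post-change increment densities $p_0^{\Delta}$ and $p_1^{\Delta}$ over a step of length $\Delta$ exist with respect to a common dominating measure), the independent, stationary increments make the log-likelihood ratio of discretely sampled observations additive across sampling intervals, which is exactly what enables the recursive updates used in the following sections:
$$\log L_{t_n} = \sum_{k=1}^{n} \log \ell_k, \qquad \ell_k = \frac{p_1^{\Delta}\!\big(X_{t_k} - X_{t_{k-1}}\big)}{p_0^{\Delta}\!\big(X_{t_k} - X_{t_{k-1}}\big)},$$
with the $\ell_k$ i.i.d. under each regime.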
3. Minimax Optimality in Lorden's Sense
Lorden's minimax optimality criterion defines the gold standard for real-time change detection: it requires minimizing the worst-case expected detection delay, uniformly over all possible change-point times and sample paths, subject to a constraint on the average (false) alarm rate under the null regime. Specifically, a stopping time $\tau$ is sought that minimizes
$$J(\tau) = \sup_{\theta \ge 0}\ \operatorname*{ess\,sup}\ \mathbb{E}_\theta\!\left[(\tau - \theta)^+ \,\middle|\, \mathcal{F}_\theta\right],$$
subject to the false-alarm constraint
$$\mathbb{E}_\infty[\tau] \ge \gamma,$$
where $\mathbb{E}_\theta$ denotes expectation when the change occurs at time $\theta$, $\mathbb{E}_\infty$ denotes expectation under the no-change regime, and $\gamma > 0$ is a prescribed lower bound on the mean time to a false alarm.
For Lévy processes, the main result shows the CUSUM stopping rule is minimax optimal in this sense, i.e., it minimizes the maximal delay under a fixed false alarm constraint. Optimality is established via:
- Approximation with discrete-time problems (equispaced grid), showing discrete CUSUM is optimal (Proposition 3.6),
- Passing to the continuous-time limit,
- Martingale and optimal stopping arguments (Snell envelopes),
- Equalizer property: CUSUM achieves a constant worst-case delay for all change-point times (sketched below).
This establishes CUSUM's unique suitability for practical, robust operation in the continuous-time context of Lévy processes.
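An informal sketch of the equalizer property, stated under the standing assumptions of this section rather than as a full proof: the CUSUM statistic $U_t$ is reflected at $0$, so the conditional worst case over pre-change histories at a change time $\theta$ is attained on the event $\{U_\theta = 0\}$, and from that state
$$\operatorname*{ess\,sup}\ \mathbb{E}_\theta\!\left[(\tau_h - \theta)^+ \,\middle|\, \mathcal{F}_\theta\right] = \mathbb{E}^{(1)}\!\left[\inf\{t \ge 0 : U_t \ge h\} \,\middle|\, U_0 = 0\right],$$
where $\mathbb{E}^{(1)}$ denotes expectation under the post-change law; the right-hand side does not depend on $\theta$ because the post-change process again has stationary, independent increments.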
4. Discretization, Approximation, and Practical Implementation
The theoretical optimality of CUSUM is both proved and enabled by discretization:
- Change points are restricted to a discrete, equispaced grid $(t_k = k\Delta)_{k \in \mathbb{N}}$ for some step size $\Delta > 0$.
- The CUSUM statistic updates recursively using only local likelihood increments.
- As the grid is refined ($\Delta \to 0$), the discrete procedures approximate the continuous-time case with vanishing error.
For real data, which are always sampled at finite resolution, this means CUSUM is directly applicable to high-frequency data, and its performance is stable across a range of sampling intervals. Moreover, the method suits both scenarios in which changes can occur only at specific times (e.g., daily or hourly) and settings in which grid-based modeling is more natural.
Recursive implementation:

```
S_k = max(S_{k-1}, 1) * L_k    # multiplicative CUSUM update; L_k is the likelihood ratio of the k-th increment
if S_k >= exp(h): alarm()      # raise an alarm once the statistic crosses the threshold (h on the log scale)
```
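For a fully runnable illustration, the following self-contained Python sketch (not taken from the source; all names and values, such as `lam0`, `lam1`, `delta`, and `h`, are illustrative assumptions) applies the log-form recursion to a simulated Poisson jump process whose intensity shifts at an unknown time. In practice the threshold `h` would be calibrated, e.g., by Monte Carlo estimation of the no-change mean run length $\mathbb{E}_\infty[\tau_h]$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (assumptions, not from the source).
lam0, lam1 = 2.0, 4.0     # pre- and post-change jump intensities, assumed known
delta = 0.1               # sampling interval of the discretization grid
h = 5.0                   # alarm threshold on the log scale
change_index = 500        # true change point in grid steps (used only to simulate data)
n_steps = 1000

# Simulate the increment jump counts N_k of a Poisson (pure-jump Levy) process.
rates = np.where(np.arange(n_steps) < change_index, lam0, lam1)
counts = rng.poisson(rates * delta)

# Log-likelihood ratio of the k-th increment: N_k * log(lam1/lam0) - (lam1 - lam0) * delta.
log_lr = counts * np.log(lam1 / lam0) - (lam1 - lam0) * delta

# Log-form CUSUM recursion: u_k = max(u_{k-1}, 0) + log_lr_k; alarm when u_k >= h.
u, alarm_at = 0.0, None
for k, z in enumerate(log_lr):
    u = max(u, 0.0) + z
    if u >= h:
        alarm_at = k
        break

print(f"true change at step {change_index}, alarm raised at step {alarm_at}")
```

The log-form recursion is numerically preferable to the multiplicative one above, since it avoids overflow or underflow of the raw likelihood ratios over long monitoring horizons.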
5. Comparison with Alternative Change-Point Detection Methods
Bayesian Methods (e.g., Shiryaev–Roberts test) introduce priors for the change-point and focus on the expected delay rather than the worst-case. While sometimes optimal under different risk criteria, they may be computationally intensive and require knowledge of prior distributions.
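For comparison with the CUSUM recursion of Section 4, the Shiryaev–Roberts statistic admits an analogous recursion (stated here in its standard textbook form rather than as a result from the source): it sums, rather than maximizes, the likelihood ratios over candidate change points,
$$R_n = (1 + R_{n-1})\,\ell_n = \sum_{j=1}^{n} \prod_{k=j}^{n} \ell_k, \quad R_0 = 0,
\qquad \text{versus} \qquad
S_n = \max(S_{n-1}, 1)\,\ell_n = \max_{1 \le j \le n} \prod_{k=j}^{n} \ell_k,$$
each procedure raising an alarm when its statistic first exceeds a tuned threshold.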
Nonparametric Methods such as empirical tail integrals or likelihood ratios offer robustness to model misspecification but typically forfeit power (delay optimality) when pre/post-change laws are known. They may be necessary when model parameters are unknown or when data is not amenable to parametric modeling.
Exponentially Weighted Schemes and Shewhart rules may be useful for some objectives (nonlinear penalties, large deviations) but are suboptimal under Lorden’s linear penalty criterion.
Advantages of CUSUM: minimax delay optimality, equalizer property, recursive updates, and robust performance on discretely and continuously sampled observations. Limitations: requirement for explicit knowledge of pre- and post-change distributions, and suboptimality under penalties other than Lorden’s.
6. Applications and Research Directions
CUSUM for Lévy processes is applied to:
- Finance: Regime switches and risk detection in asset prices,
- Insurance: Monitoring for abrupt changes in claim rates,
- Industrial monitoring: Fault and anomaly detection in jump-driven signals,
- Environmental monitoring: Earthquake or extreme event detection,
- Cybersecurity: Detecting impulsive anomalies in network traffic.
Promising research avenues include:
- Extending CUSUM's optimality to nonlinear delay penalties (exponential, etc.),
- Fully nonparametric and robust extensions that relax requirements on the Lévy triplet,
- Generalization to models with dependent increments (e.g., Hawkes processes),
- Integration with deep learning for representation learning in multivariate or nonstationary data,
- Deployment in high-frequency, high-dimensional, or streaming settings,
- Online adaptation to unknown or time-varying pre/post-change distributions.
| Aspect | CUSUM for Lévy Processes |
|---|---|
| Change-point statistic | $U_t = \log L_t - \inf_{0 \le s \le t} \log L_s$, or $S_k = \max(S_{k-1}, 1)\,\ell_k$ in discrete time |
| Stopping rule | $\tau_h = \inf\{t \ge 0 : U_t \ge h\}$ |
| Optimality | Minimax (Lorden): worst-case expected delay minimized given a false-alarm constraint |
| Methodological basis | Discrete approximation, Markovian optimal stopping, Snell envelopes |
| Applications | Finance, insurance, engineering, cyber-physical systems, biology |
| Comparisons | Outperforms threshold/Bayesian/ad hoc rules in the minimax sense; requires model parameters |
| Future directions | Robustness, nonlinear penalties, deep learning, online/high-frequency contexts |
The rigorous development and analysis presented establish the CUSUM procedure as the minimax optimal change-point detector for Lévy processes in continuous time, underscore its direct implementability via discretization, and delineate avenues for extending such methods to more complex or data-rich environments, including those leveraging modern deep learning representations and adaptive strategies.