Greedy Online Change Point Detection

Updated 8 June 2026

GOCPD is a framework that greedily maximizes average log-likelihoods to detect abrupt distributional changes in streaming data.
It employs unimodal search strategies like ternary search and dynamic geometric grids to achieve O(log t) computational and storage efficiency.
The method integrates robust statistical tests and memory-reset mechanisms to control false alarms and minimize detection delays.

Greedy Online Change Point Detection (GOCPD) refers to a class of online, data-adaptive algorithms that detect abrupt distributional changes (“change points”) in streaming or sequentially observed data through localized, computationally efficient, typically likelihood-based criteria and grid-scanning or segmentation strategies. These procedures are characterized by the greedy maximization of evidence for a change—often by exploiting unimodality, strong log-likelihood contrasts, or CUSUM-like substructure—while maintaining low computational and storage costs suitable for high-throughput, high-dimensional, or ill-conditioned environments.

1. Formal Problem Setting and GOCPD Objective

The canonical GOCPD problem considers an infinite or streaming sequence of observations (possibly multivariate or structured), with an unknown change point $\tau$ at which the underlying data-generating process transitions from a pre-change regime (with distribution $P_1$ or parameter $\theta_1$) to a post-change regime ($P_2$ or $\theta_2$). Denoting the observed data as $\mathcal{D} = \{Y_t\}_{t=1}^T$, the task is to design an online stopping rule $\widehat\tau$ such that, with high confidence and low delay,

$\widehat\tau \approx \tau$ whenever a change occurs, and
false alarms—detections before the true $\tau$ or in the absence of change—are controlled at a nominal level.

In the “greedy” GOCPD paradigm, at each time $t$, a candidate split point $\widehat{s}_t$ is identified by maximizing a criterion favoring segmentation into two independent regimes. For example, one widely used objective is

$\widehat{s}_t = \arg\max_{t_0 < s < t} \left[ \bar{\ell}(Y_{t_0+1:s}) + \bar{\ell}(Y_{s+1:t}) \right]$

where $\bar{\ell}(\cdot)$ denotes the average log-likelihood under maximum likelihood parameters fitted to the respective segment, and $t_0$ is the last detected change point (Ho et al., 2023). This framework naturally subsumes classical CUSUM, likelihood-ratio, and residual-based tests, but is operationalized online through computationally efficient greedy search or grid-based scanning.
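As a rough illustration of a two-segment objective of this kind, the sketch below scores every admissible split of a window by the sum of the average Gaussian log-likelihoods of the left and right segments, each evaluated at its own maximum likelihood fit. The Gaussian model, the variance floor `var_floor`, the minimum segment length `min_seg`, and the function names are assumptions made for the example, not details taken from Ho et al. (2023).

```python
import math

def avg_gaussian_loglik(segment, var_floor=1e-6):
    """Average log-likelihood of a segment under its own Gaussian MLE fit."""
    n = len(segment)
    mean = sum(segment) / n
    # MLE variance, floored so that constant segments do not give log(0).
    var = max(sum((y - mean) ** 2 for y in segment) / n, var_floor)
    # At the MLE the average Gaussian log-likelihood has this closed form.
    return -0.5 * (math.log(2.0 * math.pi * var) + 1.0)

def best_split(window, min_seg=2):
    """Exhaustively pick the split index s maximizing the two-segment score.

    The left segment is window[:s] and the right segment is window[s:].
    """
    best_s, best_val = None, -math.inf
    for s in range(min_seg, len(window) - min_seg + 1):
        val = avg_gaussian_loglik(window[:s]) + avg_gaussian_loglik(window[s:])
        if val > best_val:
            best_s, best_val = s, val
    return best_s, best_val

window = [0.0, 0.1, -0.1, 0.05, -0.05, 5.0, 5.1, 4.9, 5.05, 4.95]
split, score = best_split(window)
print(split, round(score, 3))
```

This exhaustive scan costs time linear in the window length per step; it is the baseline that the greedy search strategies in the next section are meant to accelerate.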

2. Algorithmic Structure and Computational Efficiency

GOCPD methods achieve efficiency and online deployment primarily through (a) greedy/local scans over a dynamically maintained set of candidate change points, (b) judicious use of summary statistics and recurrence relations, and (c) objective functions admitting (piecewise) unimodality, enabling accelerated search.

Greedy Search by Unimodality

For univariate or multivariate time series with a single change, the GOCPD objective is typically unimodal in the candidate index $s$, a formal property established in Ho et al. (2023), Proposition 1. This property allows the change point search to be performed via ternary search, reducing the per-step computational cost from $O(t)$ to $O(\log t)$.
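The following is a minimal sketch of integer ternary search for the maximizer of a strictly unimodal function, which is the generic technique named above. The function name `ternary_search_max` and the toy objective are hypothetical; in a detector, `f` would be the split objective evaluated from cached summary statistics.

```python
def ternary_search_max(f, lo, hi):
    """Return an integer maximizer of f on [lo, hi].

    Assumes f is strictly unimodal on the interval (increasing, then
    decreasing). Uses O(log(hi - lo)) evaluations of f.
    """
    while hi - lo > 2:
        third = (hi - lo) // 3
        m1, m2 = lo + third, hi - third
        if f(m1) < f(m2):
            lo = m1 + 1   # the maximizer lies strictly to the right of m1
        else:
            hi = m2 - 1   # the maximizer lies strictly to the left of m2
    # At most three candidates remain; compare them directly.
    return max(range(lo, hi + 1), key=f)

# Toy unimodal objective peaking at s = 37.
print(ternary_search_max(lambda s: -(s - 37) ** 2, 0, 1000))
```

If the objective is only approximately unimodal (for example, under heavy noise or multiple changes), this search may return a local rather than global maximizer, which is one reason a confirmation step after the split can be useful.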

Dynamic Geometric Grid

For large-scale or high-frequency settings, (Moen, 13 Apr 2025) proposes maintaining and updating a dynamically selected geometric grid $G_t$ of candidate offsets, with $|G_t| = O(\log t)$, guaranteeing that for any true jump at time $\tau$, there exists a grid point close to $t - \tau$. This enables grid-scanning CUSUM or likelihood-type tests to be performed in $O(\log t)$ time and space:

At each time $t$, all sufficient summaries or statistics for the grid points $g \in G_t$ are incrementally updated.
For each $g \in G_t$, a test statistic $T_{t,g}$ is computed; detection is triggered if any $T_{t,g}$ crosses a threshold.
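The two steps above can be sketched for a univariate mean-change CUSUM as follows. The thinning rule used here (keep a candidate split `s` only while it is a multiple of a power of two that grows with its age) is an illustrative choice that yields a geometric grid of logarithmic size; it is not claimed to be the rule of Moen (2025). The class name, the statistic, and the threshold value are likewise assumptions.

```python
import math

class GridCusum:
    """Mean-change CUSUM scanned over a geometrically thinned candidate grid."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.t = 0          # number of observations seen so far
        self.total = 0.0    # running sum of all observations
        self.grid = []      # (s, prefix sum of the first s observations)

    def _keep(self, s):
        age = self.t - s                     # always >= 1 when called
        j = max(0, age.bit_length() - 2)     # coarser spacing for older splits
        return s % (2 ** j) == 0

    def update(self, x):
        if self.t >= 1:
            # The boundary just before x becomes a new candidate split.
            self.grid.append((self.t, self.total))
        self.t += 1
        self.total += x
        # Thin the grid; a dropped candidate is never needed again.
        self.grid = [(s, ps) for (s, ps) in self.grid if self._keep(s)]
        best = 0.0
        for s, ps in self.grid:
            n_right = self.t - s
            diff = (self.total - ps) / n_right - ps / s
            best = max(best, math.sqrt(s * n_right / self.t) * abs(diff))
        return best > self.threshold

detector = GridCusum(threshold=4.0)
stream = [0.1 if i % 2 == 0 else -0.1 for i in range(200)] + [3.0] * 20
alarms = [i + 1 for i, x in enumerate(stream) if detector.update(x)]
print(alarms[0], len(detector.grid))
```

Only one prefix sum per retained candidate is stored, so both the per-step work and the memory scale with the grid size rather than with the stream length.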

Memory and Storage

For high-dimensional data, summaries (such as partial sums or outer products) are only maintained for $O(\log t)$ relevant intervals, and many GOCPD architectures exploit tail-length or excitation-set sparsity to further reduce redundancy (Chen et al., 2020, Moen, 13 Apr 2025).

3. Test Statistics, Aggregation, and Robustification

GOCPD platforms support a wide variety of change detection statistics, including but not limited to:

Likelihood-ratio and average log-likelihood–based scores (Ho et al., 2023)
Multiscale, coordinate-wise CUSUM or CUSUM-like statistics and their cross-coordinate aggregations (for high-dimensional data or signals of unknown sparsity) (Chen et al., 2020)
Covariance, operator-norm, or residual-based metrics for detecting parameter or structural breaks (Moen, 13 Apr 2025, Leung et al., 2024)

Robustification is achieved via post-split statistical confirmation, such as Mahalanobis screening of left- and right-segment residuals to guard against outlier-induced false positives. Empirically, these outlier guards reduce the false discovery rate (FDR) significantly in both synthetic and real-world benchmarks (Ho et al., 2023).
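One way such a post-split confirmation could look is sketched below: right-segment points are scored by their squared Mahalanobis distance to the left segment, and the split is confirmed only if a sufficient fraction of them are far away, so that a single outlier does not trigger a detection. This is an illustrative rule, not the exact procedure of Ho et al. (2023). The example is restricted to two-dimensional points so the covariance inverse has a closed form; the cutoff 9.21 (approximately the 0.99 quantile of a chi-square distribution with 2 degrees of freedom), the `min_fraction` parameter, and the `ridge` term are assumptions.

```python
def mean_cov_2d(points):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(p, mean, cov, ridge=1e-9):
    """Squared Mahalanobis distance using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    sxx, syy = sxx + ridge, syy + ridge   # guard against a singular covariance
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

def confirm_change(left, right, cutoff=9.21, min_fraction=0.5):
    """Confirm a split only if enough right-segment points are far from the left fit."""
    mean, cov = mean_cov_2d(left)
    far = sum(mahalanobis_sq(p, mean, cov) > cutoff for p in right)
    return far / len(right) >= min_fraction

left = [(-1, -1), (-1, 1), (1, -1), (1, 1), (0, 0), (0.5, -0.5), (-0.5, 0.5)]
print(confirm_change(left, [(0.2, 0.1), (10, 10), (-0.3, 0.2), (0.1, -0.4)]))
print(confirm_change(left, [(5, 5), (5.2, 4.9), (4.8, 5.1), (5.1, 5.0)]))
```

The first call contains one isolated outlier and is rejected; the second is a sustained shift and is confirmed.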

4. Theoretical Guarantees and Performance Metrics

GOCPD schemes are designed with explicit statistical and computational guarantees:

False Alarm Control: The probability of detecting a change before a true change (or in its absence), $\mathbb{P}(\widehat\tau < \tau)$, is controlled at level $\alpha$ for user-specified $\alpha \in (0, 1)$ (Moen, 13 Apr 2025, Chen et al., 2020).
Detection Delay: For a sufficiently large change magnitude (or appropriate high-dimensional analogs), the expected detection delay provably matches information-theoretic minimax lower bounds up to logarithmic factors (Moen, 13 Apr 2025).
Computational Cost: Update and storage cost per time step is $O(\log t)$ for scalar/low-dimensional tests; for multivariate mean and covariance detection the bound additionally depends on the dimension (Moen, 13 Apr 2025, Chen et al., 2020).
Empirical Performance: On real and synthetic datasets, GOCPD is evaluated by true positive rate (TPR) and positive predictive value (PPV), and it outperforms or matches established baselines in FDR and runtime (Ho et al., 2023).
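For concreteness, TPR and PPV for a list of detections can be computed as in the sketch below. The matching convention is an assumption made for the example: a detection counts as a true positive if it falls at most `tol` samples after a true change point, and each true change and each detection is matched at most once. Published evaluations may use a different tolerance rule.

```python
def tpr_ppv(true_cps, detections, tol):
    """Return (TPR, PPV) under a one-to-one, within-tolerance matching."""
    detections = sorted(detections)
    used = [False] * len(detections)
    hits = 0
    for cp in sorted(true_cps):
        for i, d in enumerate(detections):
            # An online detector cannot fire before the change it detects.
            if not used[i] and 0 <= d - cp <= tol:
                used[i] = True
                hits += 1
                break
    tpr = hits / len(true_cps) if true_cps else 0.0
    ppv = hits / len(detections) if detections else 0.0
    return tpr, ppv

print(tpr_ppv([100, 300, 500], [104, 250, 305, 700], tol=10))
```

Under this convention the false discovery rate is simply 1 minus the PPV.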

5. Extensions: Greedy Excitation in System Identification

In adaptive and system identification settings with time-varying parameters, GOCPD is paired with greedy excitation-set selection for robust recursive least squares (RLS) updates. The key innovation is to maintain an online "greedy excitation set": newly acquired regressors are admitted if and only if they do not worsen the Hessian condition number (Leung et al., 2024). The parameter update at each step then uses a two-tier weighting, retaining informative past data and exponentially forgetting the rest, which leads to improved tracking and bias-variance control.
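The admission rule can be sketched as follows, assuming two-dimensional regressors so that the eigenvalues of the 2x2 Hessian (the sum of outer products of admitted regressors) are available in closed form. The class name, the `warmup` parameter (the first few regressors are admitted unconditionally, since an empty Hessian has no finite condition number), and the omission of the two-tier weighting are simplifications made for the example rather than details from Leung et al. (2024).

```python
import math

def cond_2x2(h):
    """Condition number of the symmetric matrix [[a, b], [b, c]]."""
    a, b, c = h
    mid = 0.5 * (a + c)
    rad = math.sqrt((0.5 * (a - c)) ** 2 + b * b)
    lam_min, lam_max = mid - rad, mid + rad
    return math.inf if lam_min <= 0.0 else lam_max / lam_min

def add_outer(h, phi):
    """Return h + phi phi^T in the same (a, b, c) packing."""
    a, b, c = h
    return (a + phi[0] * phi[0], b + phi[0] * phi[1], c + phi[1] * phi[1])

class GreedyExcitationSet:
    def __init__(self, warmup=2):
        self.warmup = warmup
        self.members = []
        self.hessian = (0.0, 0.0, 0.0)

    def offer(self, phi):
        """Admit phi iff it does not worsen the Hessian condition number."""
        candidate = add_outer(self.hessian, phi)
        in_warmup = len(self.members) < self.warmup
        if in_warmup or cond_2x2(candidate) <= cond_2x2(self.hessian):
            self.members.append(phi)
            self.hessian = candidate
            return True
        return False

es = GreedyExcitationSet()
for phi in [(1.0, 0.0), (1.0, 0.1), (1.0, 0.0), (0.0, 1.0)]:
    print(phi, es.offer(phi))
```

In this toy run the repeated regressor `(1.0, 0.0)` is rejected because it adds energy only along an already well-excited direction, while `(0.0, 1.0)` is admitted because it improves conditioning.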

An embedded GOCPD change point detector, based on EWMA-filtered model residuals and a likelihood-ratio test for jump-induced miss distributions, triggers a memory reset, discarding obsolete historical data and reinitializing the model post-jump for rapid reacquisition of new dynamics. This memory-resetting is provably optimal under likelihood tests and preserves adaptivity in ill-conditioned regimes.
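A much simplified version of this reset loop is sketched below. A running mean stands in for the RLS model, and a standard EWMA control-chart limit stands in for the likelihood-ratio test; both substitutions, along with the class name and the parameters `lam`, `k`, and `sigma`, are assumptions for illustration and not the method of Leung et al. (2024).

```python
import math

class ResettingMeanTracker:
    """Running-mean model with an EWMA residual monitor and memory reset."""

    def __init__(self, lam=0.2, k=4.0, sigma=1.0):
        self.lam = lam
        # Asymptotic standard deviation of an EWMA of i.i.d. residuals
        # with standard deviation sigma is sigma * sqrt(lam / (2 - lam)).
        self.limit = k * sigma * math.sqrt(lam / (2.0 - lam))
        self._restart(None)

    def _restart(self, y):
        """Discard all history; optionally reinitialize from one sample."""
        self.n = 0 if y is None else 1
        self.mean = 0.0 if y is None else y
        self.ewma = 0.0

    def update(self, y):
        """Process one observation; return True if a reset was triggered."""
        if self.n == 0:
            self._restart(y)
            return False
        resid = y - self.mean
        self.ewma = (1.0 - self.lam) * self.ewma + self.lam * resid
        if abs(self.ewma) > self.limit:
            self._restart(y)   # memory reset: forget pre-jump data
            return True
        self.n += 1
        self.mean += resid / self.n
        return False

tracker = ResettingMeanTracker()
stream = [0.5 if i % 2 == 0 else -0.5 for i in range(50)] + [5.0] * 10
alarms = [i for i, y in enumerate(stream) if tracker.update(y)]
print(alarms, tracker.mean)
```

Without the reset, the running mean would need many post-jump samples to converge to the new level; after the reset it is reinitialized from post-jump data only.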

6. Multiscale and High-Dimensional Frameworks

GOCPD architectures are designed to perform efficient detection in both low- and high-dimensional regimes. For high-dimensional Gaussian streams, multiscale likelihood-ratio and CUSUM-like statistics are computed over dyadic grids for each coordinate, yielding aggregation strategies (e.g., hard-thresholded sums) that adapt to varying sparsity levels and unknown signal strengths (Chen et al., 2020, Moen, 13 Apr 2025). All core formulas (per-coordinate scan statistics, aggregation, and stopping rules) are updated greedily with minimal memory and are implemented in software such as the R package 'ocd'.
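The idea of hard-thresholded aggregation over dyadic scales can be illustrated with the simplified sketch below. It assumes a known pre-change mean of zero and unit variance in every coordinate, and it recomputes tail sums from a stored window for readability instead of using recursive updates. It is not the algorithm implemented in 'ocd'; the function names and the threshold `a` are hypothetical.

```python
import math

def hard_threshold_sum(stats, a):
    """Sum of squared coordinate statistics whose magnitude exceeds a."""
    return sum(z * z for z in stats if abs(z) > a)

def multiscale_statistic(window, a):
    """Maximum hard-thresholded aggregate over dyadic tail lengths.

    window: list of p-dimensional observations, most recent last.
    """
    p = len(window[0])
    best, h = 0.0, 1
    while h <= len(window):
        tail = window[-h:]
        # Standardized tail sum per coordinate at scale h.
        z = [sum(row[j] for row in tail) / math.sqrt(h) for j in range(p)]
        best = max(best, hard_threshold_sum(z, a))
        h *= 2
    return best

quiet = [[0.1, -0.1, 0.1, -0.1] for _ in range(8)]
shifted = quiet[:4] + [[2.0, -0.1, 0.1, -0.1] for _ in range(4)]
print(multiscale_statistic(quiet, a=2.0), multiscale_statistic(shifted, a=2.0))
```

Thresholding removes the many small noise-only coordinates from the sum, so a change confined to a single coordinate is not diluted by the remaining dimensions.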

The general methodology allows any offline scan statistic (CUSUM, LR, operator-norm, etc.) to be dynamically embedded into a streaming GOCPD architecture, transforming classical global scans into locally greedy, real-time detectors.

7. Practical Implementation and Applications

Published implementations emphasize the following features:

Maintenance of summary statistics and candidate grids with $O(\log t)$ amortized time and space (Moen, 13 Apr 2025)
Flexible modeling choices (Gaussian, GP, or regression models) according to domain requirements (Ho et al., 2023, Leung et al., 2024)
Outlier-robust postprocessing and statistical threshold calibration via theoretical/empirical procedures
Empirically validated performance in applications ranging from high-frequency market data, EEG seizure detection, and activity monitoring to online adaptive regression in time-varying systems (Ho et al., 2023, Moen, 13 Apr 2025, Chen et al., 2020, Leung et al., 2024)
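Empirical threshold calibration can be sketched with a small Monte Carlo routine: simulate streams with no change, record the maximum of the detection statistic over a fixed horizon, and take an upper quantile of those maxima as the threshold. The sliding-window statistic, the standard Gaussian null model, and all parameter names below are assumptions chosen to keep the example short.

```python
import math
import random

def max_window_stat(stream, w):
    """Maximum over time of |sum of the last w observations| / sqrt(w)."""
    best, s = 0.0, 0.0
    for i, x in enumerate(stream):
        s += x
        if i >= w:
            s -= stream[i - w]      # slide the window forward
        if i >= w - 1:
            best = max(best, abs(s) / math.sqrt(w))
    return best

def calibrate_threshold(alpha, horizon, w, n_runs=200, seed=0):
    """Empirical (1 - alpha) quantile of the null maximum statistic."""
    rng = random.Random(seed)
    maxima = sorted(
        max_window_stat([rng.gauss(0.0, 1.0) for _ in range(horizon)], w)
        for _ in range(n_runs)
    )
    k = min(n_runs - 1, math.ceil((1.0 - alpha) * n_runs) - 1)
    return maxima[k], maxima

threshold, _ = calibrate_threshold(alpha=0.05, horizon=200, w=10)
print(round(threshold, 3))
```

A threshold obtained this way controls the false alarm probability only over the simulated horizon and under the simulated null model, so both should be matched to the deployment setting.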

A tabular synopsis of GOCPD methodologies drawn from key references follows:

Methodology Source	Core Idea	Per-step Complexity
(Ho et al., 2023)	Unimodal log-likelihood maximization + ternary search	$O(\log t)$
(Moen, 13 Apr 2025)	Dynamic geometric scan grid, any offline statistic	$O(\log t)$ for scalar tests; dimension-dependent for multivariate tests
(Chen et al., 2020)	Multiscale, coordinate-wise greedy CUSUM + aggregation	Dimension-dependent
(Leung et al., 2024)	Greedy excitation-set RLS + LR change detection	One RLS update plus one residual-based LR test

The empirical and theoretical analyses collectively establish GOCPD as an efficient, versatile, and statistically rigorous framework for rapid, robust change-point detection under streaming constraints and challenging data regimes.

References

Ho et al. (2023). Greedy online change point detection.
Moen (2025). A general methodology for fast online changepoint detection.
Chen et al. (2020). High-dimensional, multiscale online changepoint detection.
Leung et al. (2024). Online Identification of Time-Varying Systems Using Excitation Sets and Change Point Detection.
