
Self-Normalized Mean Change Detection

Updated 24 December 2025
  • The paper introduces self-normalized test statistics that detect mean changes without explicit nuisance parameter estimation, ensuring robust inference.
  • The methodology extends to various data types including independent, dependent, high-dimensional, functional, and locally stationary processes.
  • The approach supports multiple change-point localization using recursive segmentation and simulation-calibrated thresholds for finite-sample accuracy.

Self-normalized extensions for mean changes constitute a class of inferential methodologies for detecting changes in the mean of sequential data, built around ratios of test statistics and variance estimators constructed from the data itself. These approaches deliver strong theoretical guarantees and robust finite-sample performance without explicit estimation of nuisance parameters such as the long-run variance. The self-normalization framework has been systematically developed for independent and dependent univariate data, long-range dependent and locally stationary sequences, and high-dimensional and functional data, and it forms the backbone of multiple-change-point segmentation algorithms in complex scenarios.

1. Foundational Self-Normalized Mean Change Tests

The self-normalized change-in-mean test introduced by Csörgő et al. (Csorgo et al., 2013) provides a rigorous framework for detecting at most one change in the mean of a sequence of independent observations $X_1, \ldots, X_n$ with a common but unknown mean. The core construction is the maximal self-normalized deviation statistic

$$T_n = \max_{2 \leq k \leq n-2} \frac{\left| S_k - \tfrac{k}{n} S_n \right|}{\sqrt{\tfrac{k(n-k)}{n}}\; V_n(k)},$$

where $S_k$ is the cumulative sum up to time $k$, and $V_n(k)$ is the within-segment pooled sample standard deviation,

$$V_n(k) = \sqrt{U_{k,n}^2 + V_{k,n}^2}.$$

Here, $U_{k,n}^2$ and $V_{k,n}^2$ are the unbiased sample variances before and after time $k$. Under the null hypothesis (no change in mean) and the moment condition $\mathbb{E}\,X^2 \log\log(|X|+1) < \infty$, $T_n$ (suitably normalized) converges weakly to a standard Gumbel law, $\Pr(a_n T_n - b_n \leq x) \rightarrow \exp(-e^{-x})$, with $a_n = \sqrt{2\log\log n}$ and $b_n = 2\log\log n + \tfrac{1}{2}\log\log\log n - \tfrac{1}{2}\log\pi$. The same limit holds for infinite-variance $X_i$ in the domain of attraction of the normal law, under regular variation conditions.

In the presence of a single mean change, $T_n$ diverges in probability, implying strong consistency for any diverging rejection threshold $c_n$ with $c_n = o(b_n/a_n)$ (Csorgo et al., 2013).
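
For concreteness, the following minimal NumPy sketch computes $T_n$ exactly as defined above together with an approximate critical value from the Gumbel limit; the function names, the default level, and the closing example are illustrative choices, not code from (Csorgo et al., 2013).

```python
import numpy as np

def self_normalized_Tn(x):
    """Maximal self-normalized deviation T_n over 2 <= k <= n-2,
    with V_n(k)^2 = U_{k,n}^2 + V_{k,n}^2 the pooled unbiased sample
    variances before and after the candidate change point k."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    S = np.cumsum(x)
    best = -np.inf
    for k in range(2, n - 1):                      # k = 2, ..., n-2
        num = abs(S[k - 1] - (k / n) * S[-1])      # |S_k - (k/n) S_n|
        u2 = np.var(x[:k], ddof=1)                 # unbiased variance before k
        v2 = np.var(x[k:], ddof=1)                 # unbiased variance after k
        denom = np.sqrt(k * (n - k) / n) * np.sqrt(u2 + v2)
        best = max(best, num / denom)
    return best

def gumbel_threshold(n, alpha=0.05):
    """Approximate level-alpha critical value from the Gumbel limit
    Pr(a_n T_n - b_n <= x) -> exp(-e^{-x})."""
    ll = np.log(np.log(n))
    a_n = np.sqrt(2 * ll)
    b_n = 2 * ll + 0.5 * np.log(ll) - 0.5 * np.log(np.pi)
    return (-np.log(-np.log(1 - alpha)) + b_n) / a_n

# Example: a unit mean shift at the midpoint of n = 500 observations.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 250), rng.normal(1.0, 1.0, 250)])
print(self_normalized_Tn(x), gumbel_threshold(len(x)))
```

Because extreme-value limits of this type converge slowly (at iterated-logarithm rates), simulation-calibrated thresholds are often preferred at moderate sample sizes, in line with the finite-sample calibration discussed below.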

2. Extensions to Dependent, Long-Range Dependent, and Locally Stationary Time Series

The classical self-normalized approach is well suited to independent or weakly dependent data. For strong dependence or local stationarity, modifications are necessary.

  • Long-Range Dependence: For Gaussian-subordinated long-range dependent processes, self-normalized Wilcoxon-type tests are constructed from partial sums of ranks combined with a suitable self-normalizer (Betken, 2014). This allows detection of mean changes when serial dependence precludes straightforward estimation of the variance. Critical values are determined via simulation of Hermite processes (fractional Brownian motion when the Hermite rank is $1$).
  • Locally Stationary Processes: In environments where the variance function $\sigma^2(t)$ varies over time, the classical factorization of the long-run variance fails. Heinrichs (8 Sep 2025) develops a bivariate CUSUM-based approach that relies on partial-sum arrays $S_n(t,s)$ over sequentially permuted blocks and constructs a test statistic $T_n$ as a ratio of two suprema over different marginals of the process:

$$T_n = \frac{\sup_s |V_n(s)|}{\sup_s |H_n(s)|}.$$

This approach yields a pivotal limit law determined by suprema of independent Brownian motions, ensuring correct type I error control and power against broad alternatives, without the need for direct variance estimation (Heinrichs, 8 Sep 2025).
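
Critical values for such pivotal statistics are obtained by simulating the limiting functional on a fine grid. The sketch below assumes, purely for illustration, a limit of the form $\sup_{s \in [0,1]}|B_1(s)| / \sup_{s \in [0,1]}|B_2(s)|$ with independent standard Brownian motions $B_1, B_2$; the exact functional in (Heinrichs, 8 Sep 2025) may differ, but the Monte Carlo calibration recipe is the same.

```python
import numpy as np

def mc_ratio_critical_value(alpha=0.05, n_grid=1000, n_rep=5000, seed=0):
    """Monte Carlo (1 - alpha) quantile of sup|B_1| / sup|B_2| on [0, 1].

    ASSUMED functional form, for illustration only: two independent
    standard Brownian motions approximated on an n_grid point lattice.
    """
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_grid
    # (n_rep, 2, n_grid) Gaussian increments; cumulative sums give BM paths
    paths = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(n_rep, 2, n_grid)), axis=2)
    sup_abs = np.abs(paths).max(axis=2)            # sup_s |B_i(s)| per replication
    samples = sup_abs[:, 0] / sup_abs[:, 1]
    return np.quantile(samples, 1 - alpha)

print(mc_ratio_critical_value())                   # simulated 95% quantile
```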

3. High Dimensional and Functional Extensions

  • High-Dimensional Data: For $\mathbb{R}^p$-valued observations $\{Y_t\}_{t=1}^n$ with $p \to \infty$, self-normalized tests are based on U-statistics of the Chen–Qin form for the mean change, with the normalization constructed from sums of squared projected contrasts (Wang et al., 2019). Tests are formulated as maximizations over all cut points; in the dependent case, trimming is applied to reduce edge effects, and Monte Carlo calibration produces thresholds. These methods retain pivotality under the null and strong power in dense alternative settings, and a wild binary segmentation wrapper allows multiple change-point localization (see the sketch after this list).
  • Functional Data: For strictly stationary $L^2$-valued time series, relevant mean changes (i.e., an $L^2$-distance between means exceeding a prescribed threshold $\Delta$) are detected using self-normalized test statistics of the form $W_n = (\hat{\mathbb{T}}_n - \Delta)/\hat{\mathbb{V}}_n$, where $\hat{\mathbb{T}}_n$ is the empirical squared mean distance and $\hat{\mathbb{V}}_n$ is a self-normalizer that integrates deviations of partial-sum curves over a grid. The resulting limit law is pivotal, and critical values are obtained via simulation (Dette et al., 2018).
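
To make the high-dimensional construction concrete, the sketch below implements only the Chen–Qin-type unbiased estimate of the squared mean difference at a candidate cut point and scans it over a trimmed range; the self-normalizer of (Wang et al., 2019), built from sums of squared projected contrasts, is not reproduced here, and the trimming fraction and example dimensions are illustrative.

```python
import numpy as np

def chen_qin_contrast(X, k):
    """Unbiased estimate of ||mu_before - mu_after||^2 at cut point k.

    X is an (n, p) array whose first k rows form the 'before' sample.
    Off-diagonal sums of the Gram matrices remove the ||Y_t||^2 bias terms.
    """
    A, B = X[:k], X[k:]
    n1, n2 = len(A), len(B)
    GA, GB = A @ A.T, B @ B.T
    within_a = (GA.sum() - np.trace(GA)) / (n1 * (n1 - 1))
    within_b = (GB.sum() - np.trace(GB)) / (n2 * (n2 - 1))
    cross = 2.0 * (A @ B.T).sum() / (n1 * n2)
    return within_a + within_b - cross

def scan_cut_points(X, trim=0.1):
    """Scan cut points in a trimmed range and return the maximizing one."""
    n = len(X)
    lo, hi = max(int(trim * n), 2), min(int((1 - trim) * n), n - 2)
    vals = {k: chen_qin_contrast(X, k) for k in range(lo, hi + 1)}
    k_hat = max(vals, key=vals.get)
    return k_hat, vals[k_hat]

# Example: p = 200 coordinates, a dense mean shift of 0.3 after t = 150 of n = 300.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 200))
X[150:] += 0.3
print(scan_cut_points(X))
```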

4. Multiple Change-Point Algorithms and Segmentation

To deal with multiple changes, self-normalized statistics are embedded in recursive segmentation algorithms:

  • SNCP Algorithm: The nested local-window self-normalization framework scans overlapping neighborhoods of candidate change points and computes local statistics to obtain consistent estimates of both the number and the locations of changes, regardless of the dependence structure. For a segment $[a, b]$, the maximal self-normalized CUSUM statistic is computed over all sufficiently large local windows, and binary segmentation is applied recursively (Zhao et al., 2021).
  • Wild Binary Segmentation (WBS): High-dimensional SN change-point detection is extended via WBS, where self-normalized statistics are evaluated on random subintervals to locate breaks adaptively (Wang et al., 2019). Pivotal null distributions remain available via simulation.
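
A generic skeleton of such a wild-binary-segmentation wrapper looks as follows; the `sn_stat` interface (returning a maximal statistic and its location for a given segment), the threshold, and the interval and length defaults are placeholders rather than the tuned choices of (Wang et al., 2019) or (Zhao et al., 2021).

```python
import numpy as np

def wild_binary_segmentation(x, sn_stat, threshold, n_intervals=100, min_len=20, rng=None):
    """Recursively locate change points by maximizing a user-supplied
    self-normalized statistic over random subintervals (generic WBS sketch).

    sn_stat(segment) must return (max_statistic, argmax_index_within_segment).
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(x)
    found = []

    def recurse(lo, hi):
        if hi - lo < min_len:
            return
        best_stat, best_cp = -np.inf, None
        for _ in range(n_intervals):
            a = int(rng.integers(lo, hi - min_len + 1))     # random left endpoint
            b = int(rng.integers(a + min_len, hi + 1))      # random right endpoint
            stat, loc = sn_stat(x[a:b])
            if stat > best_stat:
                best_stat, best_cp = stat, a + loc
        if best_stat > threshold:
            found.append(best_cp)
            recurse(lo, best_cp)          # recurse on the left piece
            recurse(best_cp + 1, hi)      # and on the right piece

    recurse(0, n)
    return sorted(found)
```

Any segment-level self-normalized statistic can be plugged in as `sn_stat`, provided it also reports the location at which its maximum is attained. The table below summarizes the statistic structures across data regimes.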
| Model/Data Type | Statistic Structure | Key Reference |
|---|---|---|
| i.i.d. univariate | Maximal self-normalized deviation over $k$ | (Csorgo et al., 2013) |
| Long-range dependent | Maximal self-normalized rank/Wilcoxon statistic | (Betken, 2014) |
| Locally stationary | Bivariate CUSUM ratio statistic | (Heinrichs, 8 Sep 2025) |
| High-dimensional | Chen–Qin-type U-statistic with trimmed SN norm | (Wang et al., 2019) |
| Functional | $L^2$ mean-distance SN statistic (partial-sum based) | (Dette et al., 2018) |
| Segmentation / multiple breaks | Local-window / recursive SN CUSUM | (Zhao et al., 2021) |

5. Asymptotic Theory and Distributional Limits

The common feature across these methodologies is that the normalization is data-driven and constructed to asymptotically cancel unknown nuisance parameters (the long-run variance or an unknown scale), resulting in pivotal limit distributions under the null hypothesis.
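
Schematically, writing a generic self-normalized statistic as a ratio (the symbols $N_n$, $D_n$, and $\mathcal{L}$ are generic placeholders rather than notation from any single reference),

$$T_n = \frac{N_n}{D_n} \;\xrightarrow{\;d\;}\; \mathcal{L} \qquad \text{under } H_0,$$

where the limit $\mathcal{L}$ is pivotal because the unknown scale cancels between numerator and denominator.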

Under alternatives (single or multiple changes), the numerator diverges in probability while the normalizers remain $O_P(1)$, yielding consistency. Local alternatives, where the mean shift shrinks with $n$, produce non-central limit processes and allow for explicit characterization of power properties.

6. Implementation, Calibration, and Finite-Sample Considerations

Self-normalized mean change tests lend themselves to tuning-free or simulation-based calibration:

  • No estimation of long-run variance or dependence parameters is needed.
  • All critical values are obtained via Monte Carlo simulation of the limiting processes (Brownian motion, Hermite processes, etc.), using sample-size-appropriate configurations (Betken, 2014; Heinrichs, 8 Sep 2025).
  • Trimming or windowing parameters in dependent or high-dimensional cases are set either by theory (e.g., exclusion of very early/late break points) or simple heuristics (e.g., fixed fractions of the sample).
  • Implementation overhead is moderate and, in most cases, scales linearly in $n$; for high-dimensional or large functional data sets, the computational cost is dominated by matrix operations or repeated partial-sum computations.

Simulation studies consistently confirm type I error control, strong power for moderate-to-large changes, and robustness to serial dependence, heavy tails, and heteroscedasticity (Betken, 2014; Heinrichs, 8 Sep 2025; Zhao et al., 2021; Dette et al., 2018).
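
As a minimal sketch of how such a finite-sample check can be coded, the snippet below estimates the empirical rejection rate of the Section 1 statistic under a Gaussian null, reusing the illustrative helpers `self_normalized_Tn` and `gumbel_threshold` defined earlier; the sample size, replication count, and null model are arbitrary illustrative choices.

```python
import numpy as np

# Empirical size check under the null (i.i.d. N(0,1), no mean change),
# reusing self_normalized_Tn and gumbel_threshold from the earlier sketch.
rng = np.random.default_rng(1)
n, n_rep, alpha = 500, 200, 0.05
crit = gumbel_threshold(n, alpha)
rejections = sum(self_normalized_Tn(rng.normal(size=n)) > crit for _ in range(n_rep))
print(f"empirical rejection rate: {rejections / n_rep:.3f} (nominal level {alpha})")
```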

7. Extensions, Limitations, and Future Directions

Self-normalized testing provides a model-agnostic, theoretically grounded approach to mean change inference, and the frameworks surveyed above remain an active area of extension.

Self-normalized extensions for mean changes have proven theoretically optimal and practically competitive across statistical change-point inference regimes, providing pivotal inference for complex data scenarios with minimal assumptions on the error process or dimensionality.
