Self-Normalised Martingales

Updated 30 July 2025
  • Self-normalised martingales are stochastic processes that divide cumulative sums by intrinsic, data-dependent measures like quadratic variation.
  • They enable sharp concentration inequalities and deviation bounds, balancing Gaussian-like and heavy-tail behavior through adaptive normalisation.
  • They are pivotal in sequential analysis and online learning, providing instance-adaptive confidence sequences and robust inference in high-dimensional problems.

A self-normalised martingale is a stochastic process in which the “normalisation”—the scale or variance by which the process is divided—is an intrinsic, data-dependent (and usually increasing) random process constructed from the martingale itself, typically its predictable or total quadratic variation. This adaptive approach leads to distributional approximations, concentration inequalities, and deviation bounds that are pivotal in sequential analysis, statistical inference, and high-dimensional probability, particularly when uncertainty or heteroscedasticity precludes deterministic normalisation.

1. Definition, Structure, and Fundamental Quantities

Let $(X_i, \mathcal{F}_i)$ be a sequence of martingale differences with respect to an increasing filtration. Construct the (vector- or Hilbert-space–valued) martingale

$$S_n = \sum_{i=1}^n X_i.$$

A self-normalised martingale is a process of the form $S_n / N_n$, where $N_n$ is a random (predictable or observable) measure of scale such as

  • the total quadratic variation $[S]_n = \sum_{i=1}^n X_i^2$ (real-valued case),
  • the predictable quadratic variation $\langle S \rangle_n = \sum_{i=1}^n \mathbb{E}[X_i^2 \mid \mathcal{F}_{i-1}]$,
  • or, in the infinite-dimensional setting, the regularised form $(\langle S \rangle_n + \rho I)^{-1/2} S_n$ for $X_i$ in a Hilbert space.

For instance, the classical Student's t-statistic is expressible as a self-normalised sum,

$$T_n = \frac{S_n}{\sqrt{[S]_n}},$$

where $S_n = \sum_{i=1}^n X_i$ and $[S]_n = \sum_{i=1}^n X_i^2$.

Self-normalisation is adaptive: the normaliser $N_n$ encodes the realised variability and enables tight control even when the (possibly random) variance or scale is unknown or non-constant.
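As a concrete illustration, the following sketch (an illustrative construction, not taken from any of the cited papers) builds a martingale with a predictable, data-dependent scale and computes $S_n$, both quadratic variations, and the self-normalised statistic $S_n/\sqrt{[S]_n}$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Martingale differences X_i = sigma_i * eps_i: the conditional scale sigma_i
# is predictable (a function of the past only) and eps_i is standard normal,
# so E[X_i | F_{i-1}] = 0 and E[X_i^2 | F_{i-1}] = sigma_i^2.
n = 1000
X = np.empty(n)
sigmas = np.empty(n)
sigma = 1.0
for i in range(n):
    sigmas[i] = sigma
    X[i] = sigma * rng.standard_normal()
    sigma = 1.0 + 0.5 * abs(X[i])   # scale for the NEXT step: F_i-measurable

S_n = X.sum()                        # the martingale S_n = sum_i X_i
total_qv = np.sum(X ** 2)            # [S]_n, total quadratic variation
pred_qv = np.sum(sigmas ** 2)        # <S>_n, predictable quadratic variation
T_n = S_n / np.sqrt(total_qv)        # self-normalised statistic

print(T_n, total_qv / pred_qv)
```

Although the realised variance fluctuates from path to path, the self-normalised statistic $T_n$ stays $O(1)$, and the ratio $[S]_n / \langle S \rangle_n$ concentrates near 1.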

2. Concentration and Deviation Inequalities

Bernstein-type and related exponential inequalities for self-normalised martingales have been developed to provide both Gaussian-like (sub-exponential) and heavy-tail deviation rates depending on moment and symmetry assumptions.

For real-valued martingale differences $X_i$ satisfying moment conditions and a one-sided (lower) boundedness condition, exponential tail bounds for the self-normalised ratio $S_n/\sqrt{[S]_n}$ hold for all deviation levels $x \ge 0$ (Fan et al., 2018). These inequalities smoothly interpolate between sub-Gaussian and exponential regimes, extending classical results to cases with only a lower bound on the martingale differences. When centering and symmetry (conditionally symmetric differences) are available, even sharper Gaussian-type inequalities are established (Fan et al., 2018).
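The sub-Gaussian regime can be probed numerically. The sketch below (an illustrative Monte Carlo check, not the statement of the Fan et al. (2018) inequality) uses conditionally symmetric differences and compares the empirical tail of $S_n/\sqrt{[S]_n}$ with the classical $e^{-x^2/2}$ benchmark for the conditionally symmetric case:

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials, x = 200, 10000, 2.0

# Conditionally symmetric differences: X_i = scale_i * eps_i with eps_i
# symmetric (standard normal) and scale_i predictable, so X_i is
# symmetric given the past.
eps = rng.standard_normal((trials, n))
scale = np.ones((trials, n))
scale[:, 1:] = 1.0 + 0.1 * np.cumsum(np.abs(eps), axis=1)[:, :-1]
X = scale * eps

ratio = X.sum(axis=1) / np.sqrt((X ** 2).sum(axis=1))  # S_n / sqrt([S]_n)
emp_tail = np.mean(ratio >= x)
gauss_bound = np.exp(-x ** 2 / 2)                      # sub-Gaussian benchmark

print(emp_tail, gauss_bound)
```

The empirical tail sits well below the sub-Gaussian benchmark, even though the conditional scales here vary strongly across time and across paths.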

For vector- or Hilbert-space–valued settings, recent results achieve dimension-free Bernstein inequalities for self-normalised martingales (Akhavan et al., 28 Jul 2025). Let $\{X_i\}$ be martingale differences in a Hilbert space with $\mathbb{E}[X_i \mid \mathcal{F}_{i-1}] = 0$, bounded conditional moments, and predictable quadratic variation $\langle S \rangle_n$. The tail bound controls the regularised self-normalised norm $\|(\langle S \rangle_n + \rho I)^{-1/2} S_n\|$ up to an absolute constant, a complexity parameter playing the role of an effective dimension, and a slowly varying correction term (Akhavan et al., 28 Jul 2025).

Dimension-free means these bounds do not depend explicitly on the ambient dimension of the space, making them applicable in infinite-dimensional online learning environments.
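To make the role of the regulariser and the complexity parameter concrete, the following toy computation (illustrative assumptions throughout: a diagonal covariance with fast spectral decay standing in for a Hilbert-space covariance operator) evaluates $\|(\langle S \rangle_n + \rho I)^{-1/2} S_n\|$ at two very different ambient dimensions; the normalised norm is governed by an effective dimension that saturates, not by $d$ itself:

```python
import numpy as np

rng = np.random.default_rng(2)

def normalised_norm(d, n=100, rho=5.0, reps=50):
    # Differences X_i ~ N(0, Sigma) with Sigma = diag(1, 1/4, 1/9, ...),
    # so <S>_n = n * Sigma and the statistic reduces coordinatewise.
    lam = 1.0 / np.arange(1, d + 1) ** 2
    X = np.sqrt(lam)[None, :, None] * rng.standard_normal((reps, d, n))
    S = X.sum(axis=2)                        # reps independent copies of S_n
    z = S / np.sqrt(n * lam + rho)           # (<S>_n + rho I)^{-1/2} S_n
    return np.linalg.norm(z, axis=1).mean()  # average norm over the copies

r_small = normalised_norm(d=20)
r_large = normalised_norm(d=800)
print(r_small, r_large)
```

Raising the ambient dimension from 20 to 800 barely moves the normalised norm, because $\sum_j n\lambda_j/(n\lambda_j + \rho)$ saturates under fast eigenvalue decay; this is the kind of quantity a complexity parameter tracks in place of $d$.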

3. Cramér and Moderate Deviation Principles

Self-normalised moderate deviation theorems provide sharp approximations to tail probabilities of $S_n/\sqrt{[S]_n}$ relative to the standard normal tail $1 - \Phi(x)$. Under a finite $(2+\rho)$-th conditional moment and mild regularity conditions on the martingale differences $X_i$,

$$\left| \ln \frac{\mathbb{P}\big(S_n/\sqrt{[S]_n} \ge x\big)}{1 - \Phi(x)} \right| \le C \Big( x^{2+\rho}\,\epsilon_n^{\rho} + (1+x)\,(\epsilon_n + \delta_n) \Big)$$

for some constant $C > 0$, and error terms $\epsilon_n$, $\delta_n$ controlling higher conditional moments and the uniformity of the quadratic variation (Fan et al., 2017, Fan et al., 2023). Thus,

$$\frac{\mathbb{P}\big(S_n/\sqrt{[S]_n} \ge x\big)}{1 - \Phi(x)} = 1 + o(1)$$

holds uniformly over a "moderate deviations" regime $0 \le x = o(\epsilon_n^{-1})$, under broad conditions.

This result provides theoretical justification for normal-approximation–based inference (e.g., t-tests) in heteroscedastic and dependent settings.
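The normal-approximation claim can be checked empirically. This sketch (an illustrative simulation, not a statement from the cited papers) compares the empirical tail of the self-normalised statistic against $1-\Phi(x)$ at a moderate deviation level, under heteroscedastic symmetric differences:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(3)
n, trials, x = 200, 20000, 1.5

# Heteroscedastic, conditionally symmetric differences with predictable scale.
eps = rng.standard_normal((trials, n))
scale = np.ones((trials, n))
scale[:, 1:] = 1.0 + 0.3 * np.abs(eps[:, :-1])
X = scale * eps

stat = X.sum(axis=1) / np.sqrt((X ** 2).sum(axis=1))   # S_n / sqrt([S]_n)
emp_tail = np.mean(stat >= x)
normal_tail = 0.5 * (1.0 - erf(x / sqrt(2.0)))          # 1 - Phi(1.5)
print(emp_tail / normal_tail)
```

The tail ratio sits close to 1 at moderate $x$, consistent with the principle; pushing $x$ far into the large-deviation range would break the approximation.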

4. Berry–Esseen Bounds and Normal Approximations

For martingale difference sequences with a finite $(2+\rho)$-th moment ($\rho \in (0,1]$), the Berry–Esseen bound for the self-normalised sum is

$$\sup_{x \in \mathbb{R}} \left| \mathbb{P}\!\left( \frac{S_n}{\sqrt{[S]_n}} \le x \right) - \Phi(x) \right| \le C\, \gamma_n$$

for some constant $C > 0$ and an aggregated moment/deviation error $\gamma_n$ (Fan et al., 2017). This matches the rate for standardised martingale CLTs and is optimal in order.

Refined nonuniform bounds, in which the approximation error decays polynomially in $|x|$ away from the origin, are also available (Wu et al., 2021).

These results extend the robustness and precision of self-normalised normal approximations far beyond the i.i.d. case, accommodating dependence and heavy tails.

5. Banach- and Hilbert-Space Extensions

Self-normalisation principles generalise to Banach spaces. For a $p$-uniformly smooth Banach space $\mathcal{X}$ ($1 < p \le 2$), if $(S_n)$ is a (conditionally symmetric) $\mathcal{X}$-valued martingale with differences $X_i$, a self-normalised Azuma-type concentration bound controls $\|S_n\|$ relative to the random scale $\big(\sum_{i=1}^n \|X_i\|^p\big)^{1/p}$, with a constant determined by the smoothness geometry of $\mathcal{X}$ (Luo, 2019). Hilbert-space ($p = 2$) self-normalisation yields dimension-free, sub-Gaussian behaviour with respect to the "random scale."

Such self-normalised martingale inequalities are integral to the concentration theory of random matrices and learning in infinite-dimensional feature spaces.

6. Applications in Sequential Learning, Bandits, and Inference

Self-normalised martingale inequalities underpin sharp confidence sets and regret bounds in online learning, kernelized bandits, and high-dimensional regression.

  • Kernel Logistic Regression: The dimension-free Bernstein inequality enables anytime, computationally feasible confidence sequences for parameter estimation, scaling with the curvature of the loss in RKHS (Akhavan et al., 28 Jul 2025).
  • Kernelized Bandits: Regret bounds become instance-adaptive: controlled by the variance of the optimal arm, rather than by loose worst-case quantities, with a leading term scaling with that variance (Akhavan et al., 28 Jul 2025).
  • Student’s t-Statistic and AR Processes: Moderate deviation and Berry–Esseen results yield nonasymptotic distributional control and valid inference with unknown variances and dependent data (Fan et al., 2017, Fan et al., 2017, Fan et al., 2023).
  • Credit Risk and Density Modeling: Dynamic measure-valued SDEs with self-normalisation ensure that evolving conditional densities remain proper probability measures—critical in risk modeling (Song, 2014).
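To illustrate the sequential-learning application concretely, the sketch below builds the classical self-normalised confidence ellipsoid for online ridge regression, in the well-known form $\|\hat\theta_n - \theta^\ast\|_{V_n} \le R\sqrt{2\log\big(\det(V_n)^{1/2}\det(\lambda I)^{-1/2}/\delta\big)} + \sqrt{\lambda}\,S$. The model, parameter values, and variable names are assumptions of this sketch, not constructions from the cited papers:

```python
import numpy as np

rng = np.random.default_rng(4)
d, n, lam, R, delta = 3, 500, 1.0, 0.5, 0.05
theta_star = np.array([0.5, -0.3, 0.8])     # unknown parameter (for checking)
S_bound = 1.0                               # a priori bound on ||theta_star||

V = lam * np.eye(d)                         # V_n = lam*I + sum_t x_t x_t^T
b = np.zeros(d)                             # sum_t y_t x_t
for _ in range(n):
    x_t = rng.uniform(-1.0, 1.0, size=d)
    y_t = x_t @ theta_star + R * rng.standard_normal()  # R-sub-Gaussian noise
    V += np.outer(x_t, x_t)
    b += y_t * x_t

theta_hat = np.linalg.solve(V, b)           # online ridge estimate

# Self-normalised confidence radius: with probability >= 1 - delta,
# ||theta_hat - theta_star||_V <= beta, simultaneously over time.
_, logdetV = np.linalg.slogdet(V)
beta = R * np.sqrt(2.0 * (0.5 * logdetV - 0.5 * d * np.log(lam)
                          + np.log(1.0 / delta))) + np.sqrt(lam) * S_bound

err = theta_hat - theta_star
v_norm = float(np.sqrt(err @ V @ err))
print(v_norm, beta)
```

The radius $\beta$ grows only logarithmically with $\det V_n$, which is what makes such self-normalised confidence sequences anytime-valid and instance-adaptive.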

7. Extensions, Categorical, and Structural Perspectives

Recent categorical treatments interpret martingales (and self-normalised versions) as cones or coherent families in enriched category theory (Belle, 2023). Conditional expectation and normalization emerge organically via Kan extensions and limit constructions in metric-enriched categories, providing a structural explanation for isometric convergence and self-normalised scaling.

Further structural advances include decompositions of $L^2$-martingales as infinite sums of martingales with independent increments (and, in Brownian filtrations, sums of Gaussian martingales), with the quadratic variation precisely split among the components, a direct link to self-normalisation and spectral expansions (Delbaen, 2024).

Summary Table: Selected Results

| Inequality or Principle | Setting (Value/Norm) | Key Feature/Bound | Reference |
| --- | --- | --- | --- |
| Bernstein-type tail bound (dimension-free) | Hilbert-space–valued | Controls $\|(\langle S \rangle_n + \rho I)^{-1/2} S_n\|$ with no explicit ambient-dimension dependence | (Akhavan et al., 28 Jul 2025) |
| Berry–Esseen bound for self-normalised sum | Real-valued | Uniform normal-approximation error of optimal order | (Fan et al., 2017) |
| Moderate deviation for $\mathbb{P}(S_n/\sqrt{[S]_n} \ge x)\,/\,(1-\Phi(x))$ | Real-valued | Ratio tends to 1 uniformly over a moderate-deviations range of $x$ | (Fan et al., 2023) |
| Banach-space self-normalised Azuma | $p$-uniformly smooth ($1 < p \le 2$) | Constant set by the smoothness geometry of the space | (Luo, 2019) |
| Dynamic SDE for self-normalised density functions | Measure-valued | Evolving conditional densities remain proper probability measures | (Song, 2014) |

Concluding Remarks

Self-normalised martingales unify probabilistic, statistical, and learning-theoretic perspectives by providing robust, adaptive control of deviation, concentration, and limit behavior. Advances in high-dimensional and nonparametric regimes—facilitated by dimension-free and variance-adaptive bounds—have fundamentally broadened their impact across theoretical and applied disciplines. The developments surveyed address both foundational questions (e.g., moderate deviations, Berry–Esseen rates) and emerging applications (sequential learning, instance-adaptive inference, kernel bandits), with further generalizations—categorical, structural, or geometric—continuing to extend their reach (Akhavan et al., 28 Jul 2025, Fan et al., 2023, Song, 2014, Luo, 2019, Delbaen, 2024).
