Confidence Sequence Methods

Updated 29 May 2026

Confidence Sequence Methodology defines a series of confidence sets that guarantee uniform coverage over time, accommodating continuously updated data and adaptive stopping rules.
It employs martingale-based techniques, including exponential and mixture constructions, to offer anytime-valid statistical inference with minimal distributional assumptions.
The approach integrates efficiently with sequential experiments, online learning, and adaptive trials, ensuring robust performance even under heavy-tailed or contaminated data scenarios.

A confidence sequence is a sequence of confidence sets for a parameter of interest that is valid uniformly over time, typically under minimal distributional assumptions. Unlike fixed-sample confidence intervals, confidence sequences provide simultaneous coverage over an unbounded or arbitrary sequence of data-dependent stopping times. These tools are crucial for sequential analysis, online learning, adaptive experimentation, bandit problems, risk mitigation in online experiments, and anytime-valid inference, balancing statistical rigor with the flexibility required for modern sequential workflows.

1. Definitions, Coverage, and General Principles

Given a stochastic process $(Y_t)_{t\geq1}$ (possibly vector-valued) and a target parameter $\theta^*$ (often a mean or quantile), a confidence sequence (CS) is a sequence $\{C_t\}_{t\geq1}$ of random sets such that, for all possible parameter values,

$\mathbb{P}(\forall t\geq 1: \theta^* \in C_t) \geq 1 - \alpha.$

Equivalently, $\mathbb{P}(\theta^* \in C_\tau) \geq 1 - \alpha$ for any (potentially data-dependent) stopping time $\tau$. This property is called "anytime validity" or "uniform coverage." It guarantees that inference remains valid under optional stopping, arbitrary peeking, or adaptive decision-making (Howard et al., 2018, 2002.03658, Ding et al., 2016).

Confidence sequences can be tailored to bounded means (Kilian et al., 8 May 2026, Ryu et al., 2024), unbounded/heavy-tailed means (Wang et al., 2022, Bhatt et al., 2022), quantiles (Howard et al., 2019), regression and generalized linear models (Clerico et al., 23 Apr 2025, Kirschner et al., 20 Feb 2025), and even matrix-valued parameters (Howard et al., 2018) or quantum states (Cumitini et al., 28 Jan 2026). The key technical ingredient is often a nonnegative supermartingale or test martingale, coupled with Ville's maximal inequality, which provides time-uniform error control.

2. Core Methodologies: Martingale and Mixture Constructions

The heart of modern confidence sequence construction is the exponential supermartingale approach. If one constructs a process $(M_t(\mu))_{t\geq 0}$ such that:

$M_0(\mu)=1$,
$(M_t(\mu))$ is a supermartingale when $\mu$ is the true parameter,

then Ville's inequality yields, at the true parameter $\mu$: $\mathbb{P}\left(\exists t \geq 1: M_t(\mu) \geq 1/\alpha\right) \leq \alpha.$ Inverting this yields at each time $t$ a set $C_t = \{\mu : M_t(\mu) < 1/\alpha\}$ with simultaneous coverage (Howard et al., 2018).
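As a rough illustration of this inversion, the sketch below uses the simplest exponential supermartingale, assuming the observations are $\sigma$-sub-Gaussian around the mean and using a single fixed tuning value `lam`. The class name `SubGaussianCS`, the parameter names, and the choice to split $\alpha$ across two one-sided processes are assumptions made for this example, not constructions taken from the cited papers.

```python
import math


class SubGaussianCS:
    """Fixed-lambda exponential supermartingale CS for a sigma-sub-Gaussian mean.

    Illustrative sketch: the two one-sided processes
        exp(+lam * sum(y_i - mu) - t * lam^2 * sigma^2 / 2)
        exp(-lam * sum(y_i - mu) - t * lam^2 * sigma^2 / 2)
    are each compared with the threshold 2 / alpha (Ville's inequality at alpha / 2).
    """

    def __init__(self, sigma=1.0, alpha=0.05, lam=0.1):
        self.sigma = sigma
        self.alpha = alpha
        self.lam = lam
        self.t = 0
        self.total = 0.0

    def update(self, y):
        """Absorb one observation in O(1) time and return the current interval."""
        self.t += 1
        self.total += y
        mean = self.total / self.t
        # Solving M_t(mu) < 2 / alpha for mu gives a closed-form radius.
        radius = (math.log(2 / self.alpha) / (self.lam * self.t)
                  + self.lam * self.sigma ** 2 / 2)
        return mean - radius, mean + radius


cs = SubGaussianCS(sigma=1.0, alpha=0.05, lam=0.5)
for y in [0.0] * 10:
    lower, upper = cs.update(y)
```

With a fixed `lam` the radius never drops below `lam * sigma**2 / 2`, which is one reason mixture and stitched constructions are preferred when the interval should keep shrinking.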

A common recipe is to construct $M_t(\mu)$ via exponential or mixture-exponential forms, using Cramér-Chernoff style bounds for light tails (Howard et al., 2018, Kilian et al., 8 May 2026), robust Catoni-type forms for heavy tails (Wang et al., 2022, Bhatt et al., 2022, Wang et al., 2023), or more general mixture martingales (Kirschner et al., 20 Feb 2025, Cortinovis et al., 28 Jun 2025, Wang et al., 2023). For multivariate and vector-valued processes, a gambling/betting perspective is increasingly favored, allowing uniform coverage in higher dimensions via portfolio strategies (Ryu et al., 2024).

In parametric settings, mixture over the likelihood ratio yields a nonnegative martingale as in Robbins' construction,

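$M_t(\theta) = \int \prod_{i=1}^{t} \frac{p_{\theta'}(Y_i)}{p_{\theta}(Y_i)} \, d\pi(\theta'),$

where $p_\theta$ is the model density and $\pi$ is a mixing (prior) distribution over the parameter space, leading to CSs that obey the likelihood principle and can incorporate prior information (2002.03658, Cortinovis et al., 28 Jun 2025).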
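For a concrete instance of the mixture construction, consider Gaussian observations with known variance $\sigma^2$ and a $N(0, \rho^2)$ mixing distribution over the shift $\theta' - \theta$. The integral then has a closed form, and so does the CS radius. The sketch below assumes exactly this setting; the function names and the parameter `rho` are hypothetical names chosen for the example.

```python
import math


def log_mixture_martingale(s, t, sigma=1.0, rho=1.0):
    """log M_t for the Gaussian mixture martingale.

    s is the centered sum  sum_i (Y_i - theta)  after t observations.
    Closed form: M_t = sqrt(sigma^2 / v) * exp(rho^2 s^2 / (2 sigma^2 v)),
    with v = sigma^2 + t * rho^2.
    """
    v = sigma ** 2 + t * rho ** 2
    return -0.5 * math.log(v / sigma ** 2) + rho ** 2 * s ** 2 / (2 * sigma ** 2 * v)


def normal_mixture_radius(t, sigma=1.0, rho=1.0, alpha=0.05):
    """Radius r_t such that C_t = [mean_t - r_t, mean_t + r_t] = {theta: M_t < 1/alpha}."""
    v = sigma ** 2 + t * rho ** 2
    log_term = math.log(math.sqrt(v / sigma ** 2) / alpha)
    return math.sqrt(2 * sigma ** 2 * v * log_term / rho ** 2) / t


r100 = normal_mixture_radius(100)
r1000 = normal_mixture_radius(1000)
```

Unlike a fixed-parameter exponential bound, this radius shrinks to zero as `t` grows, at the cost of an extra logarithmic factor in `t`.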

Self-normalized martingales, empirical-Bernstein-type techniques, and polynomial or stitched uniform boundaries provide refined, data-dependent width adaptation, often reducing asymptotic width by exploiting the observed variance or tail decay (Howard et al., 2018, Mineiro, 2022).

3. Heavy-Tailed, Robust, and Adaptive Confidence Sequences

Confidence sequences have been generalized to accommodate heavy-tailed or contaminated data. Catoni-style influence functions, which dampen the effect of outliers, are used to construct test martingales that require only a bounded $p$-th central moment (for some $p > 1$), rather than bounded variance or sub-Gaussian tails: $M_t(\mu) = \prod_{i=1}^{t} \exp\left(\phi(\lambda_i (Y_i - \mu)) - \lambda_i^p v_p / p\right),$ where $\phi$ is a log-influence function with bounded range, $v_p$ bounds the $p$-th central moment, and $\lambda_i$ is calibrated for coverage (Wang et al., 2022, Bhatt et al., 2022, Wang et al., 2023).
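A minimal sketch of a Catoni-style interval in the bounded-variance case ($p = 2$) is given below. It assumes a known variance bound `sigma2`, uses Catoni's original influence function, and picks a deterministic tuning sequence `lams` purely for illustration; the cited papers should be consulted for their recommended choices. Because the sum of influence values is strictly decreasing in the candidate mean, each endpoint is found by plain bisection.

```python
import math


def catoni_phi(x):
    """Catoni's influence function: odd, increasing, with logarithmic growth."""
    return math.copysign(math.log1p(abs(x) + x * x / 2), x)


def catoni_cs(xs, sigma2, alpha=0.05):
    """Interval C_t for the mean at time t = len(xs), assuming Var(X) <= sigma2."""
    t = len(xs)
    # Assumed tuning sequence: deterministic, hence predictable.
    lams = [math.sqrt(2 * math.log(2 / alpha) / (i * math.log(1 + i) * sigma2))
            for i in range(1, t + 1)]
    bound = sigma2 * sum(l * l for l in lams) / 2 + math.log(2 / alpha)

    def f(mu):
        # Strictly decreasing in mu because catoni_phi is increasing.
        return sum(catoni_phi(l * (x - mu)) for l, x in zip(lams, xs))

    def solve(target):
        lo, hi = min(xs) - 1.0, max(xs) + 1.0
        while f(lo) < target:   # widen until the root is bracketed
            lo -= 2 * (hi - lo)
        while f(hi) > target:
            hi += 2 * (hi - lo)
        for _ in range(100):    # plain bisection
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # The upper process crosses 2/alpha when f(mu) >= bound,
    # the lower process when f(mu) <= -bound.
    return solve(bound), solve(-bound)


sym_lo, sym_hi = catoni_cs([0.0] * 50, sigma2=1.0)
data = [((i * 37) % 11) - 5.0 for i in range(200)]
lo, hi = catoni_cs(data, sigma2=10.0)
```

This version recomputes the sum over all past observations at each evaluation, so it favors clarity over speed.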

In the presence of adversarial corruption, Huber-robust confidence sequences provide optimal-width intervals for the mean under total variation contamination. They leverage robust Catoni-style supermartingales with an offset in their control term that depends on the TV contamination radius $\varepsilon$ and on a bound $\sigma^2$ on the second moment under the uncontaminated law (Wang et al., 2023). This yields near-minimax width, tighter than sequentialization of fixed-time robust intervals.

Adaptive and data-driven tuning of tradeoff parameters (such as $\lambda_t$ or prior weights) further reduces unnecessary conservatism and allows recovery of LIL ($\sqrt{\log\log t / t}$) or minimax rates for interval shrinking (Howard et al., 2018, Wang et al., 2022, Bhatt et al., 2022).

4. Regret, Online Learning, and Generalized Models

Confidence sequence methodology integrates with online learning, regret analysis, and sequential generalized linear models (GLMs) (Clerico et al., 23 Apr 2025, Kirschner et al., 20 Feb 2025). Given an online forecaster (predictive or mixture), its regret against the log-likelihood risk determines the diameter of the CS: $C_t = \{\theta : L_t(\theta) - \min_{\theta'} L_t(\theta') \leq \mathrm{Regret}_t + \log(1/\alpha)\},$ where $L_t$ is the cumulative negative log-likelihood and $\mathrm{Regret}_t$ is the forecaster's cumulative log-loss regret against the best fixed parameter. For GLMs, this typically recovers a minimax-optimal diameter (Clerico et al., 23 Apr 2025).
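To make the forecaster-based construction concrete, the sketch below treats the simplest GLM, a Bernoulli model, and uses the Krichevsky-Trofimov (add-one-half) predictor as the online forecaster. The ratio of the forecaster's sequential probabilities to the likelihood under a candidate $\theta$ is a test martingale when $\theta$ is true, so the candidates whose log-ratio stays below $\log(1/\alpha)$ form the set at the current time. The function name, the grid of candidates, and the choice of forecaster are assumptions for this example.

```python
import math


def bernoulli_predictive_cs(ys, alpha=0.05, grid_size=999):
    """CS for a Bernoulli mean at time len(ys), from an online forecaster.

    log_m[j] accumulates  sum_s log( q_s(y_s) / p_theta(y_s) )  for the j-th
    candidate theta, where q_s is the forecaster's one-step-ahead prediction.
    """
    grid = [(j + 1) / (grid_size + 1) for j in range(grid_size)]
    log_m = [0.0] * grid_size
    ones = 0
    for s, y in enumerate(ys):
        # Krichevsky-Trofimov prediction of y = 1 after s observations.
        q1 = (ones + 0.5) / (s + 1.0)
        log_q = math.log(q1 if y == 1 else 1.0 - q1)
        for j, theta in enumerate(grid):
            log_m[j] += log_q - math.log(theta if y == 1 else 1.0 - theta)
        ones += y
    threshold = math.log(1 / alpha)
    keep = [theta for theta, lm in zip(grid, log_m) if lm < threshold]
    return min(keep), max(keep)


ys = ([1] * 7 + [0] * 3) * 10   # 70 ones among 100 observations
lo, hi = bernoulli_predictive_cs(ys)
```

A forecaster with smaller log-loss regret keeps the log-ratio closer to the maximized likelihood, which is what tightens the resulting set.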

Martingale arguments, online density estimation, and universal portfolio strategies yield multi-dimensional or matrix-valued confidence sets with correct anytime validity. For example, gambling-based confidence sequences use a wealth process that corresponds to an optimal betting strategy, such as Cover's universal portfolio, yielding convex CSs for bounded random vectors (Ryu et al., 2024). In the context of sampling without replacement, a martingale construction for the Bayesian posterior-prior ratio delivers exact uniform coverage and incorporates the finite-population variance gain compared to with-replacement bounds (Waudby-Smith et al., 2020).
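The betting view can be sketched in one dimension for observations in $[0, 1]$. For each candidate mean $m$, a gambler bets a fixed fraction on the observations exceeding $m$ and, in a second account, on them falling below $m$; the average of the two wealth processes is a test martingale when $m$ is the true mean. The code below uses a fixed betting fraction `lam`, which is a deliberate simplification and not Cover's universal portfolio; the function name and grid are likewise assumptions.

```python
import math


def betting_cs(xs, alpha=0.05, lam=0.5, grid_size=199):
    """Betting-style CS for the mean of [0, 1]-valued data at time len(xs)."""
    grid = [(j + 1) / (grid_size + 1) for j in range(grid_size)]
    threshold = math.log(1 / alpha)
    keep = []
    for m in grid:
        # Cap each bet so every wealth factor stays at least 1/2.
        lam_up = min(lam, 0.5 / m)
        lam_dn = min(lam, 0.5 / (1.0 - m))
        log_up = sum(math.log1p(lam_up * (x - m)) for x in xs)
        log_dn = sum(math.log1p(-lam_dn * (x - m)) for x in xs)
        # log of the averaged wealth, computed stably.
        top = max(log_up, log_dn)
        log_wealth = (top
                      + math.log(math.exp(log_up - top) + math.exp(log_dn - top))
                      - math.log(2))
        if log_wealth < threshold:
            keep.append(m)
    return min(keep), max(keep)


lo, hi = betting_cs([0.5] * 200)
```

Candidates far from the data let one of the two accounts grow exponentially, so they are rejected quickly, while the wealth at the true mean stays bounded with high probability.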

5. Implementation, Practical Concerns, and Specializations

Practical computation of confidence sequences is efficient. Most modern forms require only $O(1)$ work per data update, via sufficient statistics or incremental martingale/wealth updates (Howard et al., 2018, Kilian et al., 8 May 2026, Wang et al., 2022, Ryu et al., 2024). When inverting monotonic one-dimensional functions (as in Catoni-type or robust CSs), standard root-finding methods suffice and benefit from warm starts due to continuity over time (Wang et al., 2023).

Regularization via informative priors can produce much sharper CSs when prior and data are aligned, while bounded-influence priors prevent vacuous intervals under prior misspecification (Cortinovis et al., 28 Jun 2025). This paradigm extends naturally to Bayesian nonparametric working predictives, yielding asymptotically log-optimal width under mild conditions (Kilian et al., 8 May 2026).

Extensions cover a wide range of settings:

Sequential quantile estimation and uniform confidence bands for the entire quantile function (Howard et al., 2019),
Matrix martingales and time-uniform covariance estimation (Howard et al., 2018),
Quantum state tomography and anytime-valid inference in quantum experiments (Cumitini et al., 28 Jan 2026),
Risk-mitigating confidence sequences for adaptive online experiments, A/B/n tests, panel data, and bandits (Ham et al., 2022).

Specialized recipes (e.g., for design-based or panel experiments) provide fine-grained control and variance adaptation via proxy covariates, further increasing stopping efficiency and experimental safety (Ham et al., 2022).

6. Statistical Guarantees, Rates, and Lower Bounds

Asymptotic and nonasymptotic properties of confidence sequences have been thoroughly characterized. For bounded or sub-Gaussian outcomes, width shrinks at the near-optimal rate $O(\sqrt{\log\log t / t})$ (the law of the iterated logarithm rate), which cannot be improved by any valid, tail-symmetric CS (Howard et al., 2018, Wang et al., 2022). For heavy-tailed data with only a bounded $p$-th moment, the optimal width rate depends on $p$ (Wang et al., 2022, Bhatt et al., 2022).

In the special case of the Gaussian mean with unknown variance (t-type CS), the minimax width for level $\alpha$ and $n$ samples has a polynomial dependence on $\alpha$ that is both unavoidable and attainable by recent e-process-based CSs (Wang et al., 2023). These achieve anytime validity and match or surpass the classical fixed-n t-intervals in width for small $\alpha$ and finite $n$.

7. Applications and Impact in Statistical Practice

Confidence sequence methodology is now a central tool for:

Sequential A/B/n testing and risk-mitigating online and adaptive experimentation (Ham et al., 2022, Wang et al., 2023, Howard et al., 2019),
Multi-armed bandit algorithms and best-arm identification under data-dependent stopping (Ryu et al., 2024, Kilian et al., 8 May 2026, Howard et al., 2019),
Off-policy evaluation, contextual bandits, and mission-critical sequential decisions under unbounded and heavy-tailed importance weights (Mineiro, 2022),
Replicability analysis, meta-adaptive inference, and addressing the "peeking" problem in survey sciences (2002.03658, Ding et al., 2016).

The resulting intervals or sets are always-valid, meaning analysts do not pay additional error penalties for stopping early, monitoring adaptively, or updating in real time.

In conclusion, confidence sequence methodology provides a unified theoretical and computational framework for nonasymptotic, uniformly valid inference in sequential, online, and adaptive data analyses. Its flexibility, robustness to heavy tails and corruptions, efficient use of prior knowledge, and ease of implementation make it the statistical backbone of modern sequential experimentation and inference pipelines. For detailed methodology, computations, and proofs, see (Howard et al., 2018, Wang et al., 2022, Wang et al., 2023, Ryu et al., 2024, Ham et al., 2022, Kilian et al., 8 May 2026, Cortinovis et al., 28 Jun 2025, Clerico et al., 23 Apr 2025, 1310.03722).