Sequential Likelihood Ratios in Inference

Updated 7 May 2026

Sequential Likelihood Ratios are adaptive statistics that extend classical likelihood ratios to sequential, evolving data by incorporating plug-in estimators and mixing distributions.
They leverage martingale properties to guarantee uniform coverage for confidence sequences and control errors in dynamic hypothesis tests.
Combined with online optimization and mixture frameworks, they support applications in bandits, survival analysis, and distributed detection with non-asymptotic guarantees.

Sequential likelihood ratios constitute a foundational class of statistics used in sequential analysis, adaptive confidence set construction, multistage hypothesis testing, and distributed inference. These ratios provide the operational basis for sequential tests and confidence sequences, enabling anytime-valid inference and efficient sequential decision-making. At their core, sequential likelihood ratios generalize the classical likelihood ratio to dynamically accommodate data that arrive sequentially, possibly under arbitrary adaptivity, and admit extensions to model mixing, non-i.i.d. observations, and distributed architectures.

1. Formal Definition and Variants

The basic sequential likelihood ratio statistic for parameter $\theta$ against a sequence of plug-in estimators $\{ \theta_s \}$ is defined as

$R_t(\theta) = \frac{L_t(\{\theta_s\})}{L_t(\theta)} = \prod_{s=1}^t \frac{p_{\theta_s}(y_s|x_s)^{w_s}}{p_\theta(y_s|x_s)^{w_s}},$

where each $w_s\in(0,1]$ is an $F_{s-1}$-measurable weight and $p_{\theta}(y_s|x_s)$ is the parametric likelihood for observation $(x_s, y_s)$ (Emmenegger et al., 2023). In the sequential mixture likelihood ratio framework, a more general form is given by integrating over a proposal or predictive distribution $q_t$ and a mixing prior $\pi$:

$M_n = \int_{\Theta} \prod_{t=1}^n \frac{p_\theta(X_t)}{q_t(X_t|X_{1:t-1})} \,\pi(d\theta)$

(Kirschner et al., 20 Feb 2025). These statistics reduce to classical tests when the plug-in sequence $\{\theta_s\}$ and the predictive distributions $q_t$ are nonadaptive, but allow for model-agnostic, adaptivity-preserving inference in broader scenarios.
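
As a concrete illustration of the plug-in ratio $R_t(\theta)$, the following is a minimal sketch under assumptions that are not in the source: observations are unit-variance Gaussians with unknown mean, there are no covariates, and the plug-in $\theta_s$ is the running mean of the earlier observations (with $\theta_1 = 0$). The function name `log_seq_lr` is hypothetical.

```python
import numpy as np

def log_seq_lr(y, theta, weights=None):
    """Return log R_t(theta) for t = 1..len(y).

    Assumed model (illustration only): y_s ~ N(theta, 1).
    The plug-in theta_s uses only y_1..y_{s-1} (theta_1 = 0), so it is
    predictable, as the definition requires.
    """
    y = np.asarray(y, dtype=float)
    t = len(y)
    w = np.ones(t) if weights is None else np.asarray(weights, dtype=float)
    # theta_s = mean of the first s-1 observations, with theta_1 = 0.
    prev_sums = np.concatenate(([0.0], np.cumsum(y)[:-1]))
    plug_in = prev_sums / np.maximum(np.arange(t), 1)
    # log p_{theta_s}(y_s) - log p_theta(y_s); normalizing constants cancel.
    log_ratio = 0.5 * (y - theta) ** 2 - 0.5 * (y - plug_in) ** 2
    # Weighted running sum gives the log of the product in the definition.
    return np.cumsum(w * log_ratio)
```

Working on the log scale avoids overflow in the running product; exponentiating the result recovers $R_t(\theta)$.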

2. Martingale Properties and Confidence Sequences

A central property of sequential likelihood ratio processes is that, under the true data-generating parameter $\theta^\star$, $R_t(\theta^\star)$ forms a nonnegative supermartingale with respect to the filtration generated by the data sequence (Emmenegger et al., 2023, Kirschner et al., 20 Feb 2025). By Ville’s inequality, this leads to uniform coverage guarantees for anytime-valid confidence sets:

$C_t = \{ \theta \in \Theta : R_t(\theta) \le 1/\alpha \},$

satisfying

$\mathbb{P}_{\theta^\star}(\exists\, t \ge 1 : \theta^\star \notin C_t) \le \alpha.$

An analogous martingale property holds for mixture-based ratios $M_n$ (Kirschner et al., 20 Feb 2025), resulting in confidence sequences $C_n$ that retain the prescribed coverage $1-\alpha$ for all $n \ge 1$ simultaneously under minimal regularity.
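
The Ville-based construction can be sketched numerically: a candidate $\theta$ stays in the confidence set at time $t$ as long as $\log R_t(\theta) \le \log(1/\alpha)$. The example below is an illustration under assumptions not taken from the source (unit-variance Gaussian observations, running-mean plug-in with $\theta_1 = 0$, unit weights, and a finite grid of candidates); `lr_confidence_sequence` is a hypothetical name.

```python
import numpy as np

def lr_confidence_sequence(y, grid, alpha=0.05):
    """Return member[t, j] = True if grid[j] lies in the set at time t+1.

    Assumed model (illustration only): y_s ~ N(theta, 1), unit weights.
    """
    y = np.asarray(y, dtype=float)
    grid = np.asarray(grid, dtype=float)
    t = len(y)
    # Predictable plug-in: mean of earlier observations, theta_1 = 0.
    prev_sums = np.concatenate(([0.0], np.cumsum(y)[:-1]))
    plug_in = prev_sums / np.maximum(np.arange(t), 1)
    # Cumulative log-likelihood of the plug-in sequence, shape (t,).
    ll_plug = np.cumsum(-0.5 * (y - plug_in) ** 2)
    # Cumulative log-likelihood of every candidate, shape (t, len(grid)).
    ll_grid = np.cumsum(-0.5 * (y[:, None] - grid[None, :]) ** 2, axis=0)
    log_r = ll_plug[:, None] - ll_grid
    # Ville threshold: keep theta while R_t(theta) <= 1/alpha.
    return log_r <= np.log(1.0 / alpha)
```

Because the guarantee holds for all times simultaneously, the sets may also be intersected over time, for instance with `np.logical_and.accumulate(member, axis=0)`, without losing coverage.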

3. Hypothesis Testing with Sequential Likelihood Ratios

Sequential likelihood ratios underlie sequential probability ratio tests (SPRT) and their multilevel generalizations. For $m$ alternative hypotheses $H_0, \dots, H_{m-1}$ partitioning the parameter space into regions $\Theta_0, \dots, \Theta_{m-1}$, one constructs $m-1$ consecutive likelihood ratio processes:

$\Lambda_{i,n} = \prod_{t=1}^{n} \frac{p_{\theta_i''}(X_t)}{p_{\theta_i'}(X_t)}, \quad i = 1, \dots, m-1,$

where $\theta_i' < \theta_i''$ are bounding values for region $\Theta_i$ (Chen, 2012). Stopping and decision rules are based on the chain of ratios crossing lower and upper thresholds $A_i < B_i$ to select a unique hypothesis. Explicit finite-sample error control is achieved by selecting thresholds to satisfy:

$\Pr\{ \text{reject } H_i \mid \theta \} \le \delta_i \quad \text{for all } \theta \in \Theta_i, \quad i = 0, \dots, m-1.$

For $m = 2$, this construction reduces to the classical SPRT, providing optimal sample efficiency and explicit Type I/II error rates.
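
For the two-hypothesis case, the classical SPRT can be sketched in a few lines. This is an illustrative example rather than the multi-hypothesis procedure of (Chen, 2012): it assumes two simple hypotheses about a Gaussian mean with known variance and uses Wald's approximate thresholds, under which the realized error rates are only approximately `alpha` and `beta`. The function name `sprt_gaussian` is hypothetical.

```python
import math

def sprt_gaussian(samples, theta0, theta1, alpha=0.05, beta=0.05, sigma=1.0):
    """Wald SPRT for H0: mean = theta0 versus H1: mean = theta1.

    Returns (decision, number of samples used, final log-likelihood ratio).
    """
    upper = math.log((1.0 - beta) / alpha)  # decide H1 at or above this
    lower = math.log(beta / (1.0 - alpha))  # decide H0 at or below this
    llr = 0.0
    n = 0
    for n, x in enumerate(samples, start=1):
        # log p_theta1(x) - log p_theta0(x) for N(theta, sigma^2).
        llr += (theta1 - theta0) * (x - 0.5 * (theta0 + theta1)) / sigma**2
        if llr >= upper:
            return "H1", n, llr
        if llr <= lower:
            return "H0", n, llr
    # Data ran out before either threshold was crossed.
    return "undecided", n, llr
```

The test stops at the first threshold crossing, so the sample size is random and adapts to how strongly the data separate the two hypotheses.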

4. Sequential Likelihood Ratio Construction and Algorithmic Connections

Selecting the estimator sequence $\{\theta_s\}$ to minimize regret with respect to log-likelihood losses connects sequential likelihood ratio inference to online convex optimization. The Follow-the-Regularized-Leader (FTRL) procedure,

$\theta_t = \arg\min_{\theta \in \Theta} \Big\{ \psi(\theta) - \sum_{s=1}^{t-1} \log p_\theta(y_s|x_s) \Big\},$

with $\psi$ strongly convex, controls the cumulative regret $\mathcal{R}_t$, the gap between the cumulative log-loss of the plug-in predictor and that of the (unknowable) true parameter (Emmenegger et al., 2023). Bias in early estimators can be addressed with data-dependent weights $w_s \in (0,1]$ that depend only on information available before round $s$, mitigating inflation due to initial estimation uncertainty.
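
In simple models the FTRL estimator has a closed form, which makes the link to regret easy to inspect. The sketch below assumes (beyond what the source states) unit-variance Gaussian observations without covariates and a ridge regularizer $(\lambda/2)\theta^2$, so that the FTRL estimate is the sum of past observations divided by $\lambda + t - 1$. The names `ftrl_gaussian_mean` and `log_loss_regret` are hypothetical.

```python
import numpy as np

def ftrl_gaussian_mean(y, lam=1.0):
    """FTRL plug-in sequence theta_1..theta_t for y_s ~ N(theta, 1).

    theta_t minimizes (lam/2)*theta^2 + sum_{s<t} 0.5*(y_s - theta)^2,
    which gives theta_t = sum_{s<t} y_s / (lam + t - 1).
    """
    y = np.asarray(y, dtype=float)
    prev_sums = np.concatenate(([0.0], np.cumsum(y)[:-1]))
    return prev_sums / (lam + np.arange(len(y)))

def log_loss_regret(y, plug_in, theta_ref):
    """Cumulative log-loss of the plug-in minus that of a fixed theta_ref."""
    y = np.asarray(y, dtype=float)
    loss_plug = 0.5 * (y - plug_in) ** 2
    loss_ref = 0.5 * (y - theta_ref) ** 2
    return np.cumsum(loss_plug - loss_ref)
```

With unit weights, this regret equals $-\log R_t(\theta)$ evaluated at the reference parameter, so a small regret bound directly limits how far the ratio can fall below one.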

Furthermore, the sequential likelihood ratio approach admits immediate extension to infinite-dimensional settings via the representer theorem, allowing application in RKHSs and nonparametric models (Emmenegger et al., 2023). Sequential mixing with adaptive choice of $q_t$ or posterior-based priors enables computationally tractable and theoretically grounded uncertainty quantification in high dimensions (Kirschner et al., 20 Feb 2025).

5. Non-Asymptotic Analysis and Geometric Properties

In generalized linear models (GLMs) with exponential family structure,

$p_\theta(y|x) = h(y) \exp\big( T(y)\, x^\top \theta - A(x^\top \theta) \big),$

confidence sets induced by sequential likelihood ratios exhibit precise non-asymptotic geometric properties. Letting $Z_t$ be the regularized log-partition sum and $D_{Z_t}$ the associated Bregman divergence, Theorem 3 in (Emmenegger et al., 2023) establishes a bound on $D_{Z_t}$ between $\theta$ and the true parameter $\theta^\star$ that holds for all $t \ge 1$ and all $\theta \in C_t$.

For classical linear regression (quadratic log-partition function $A$), the Bregman divergence becomes a squared weighted Euclidean norm, and this yields the well-known ellipsoidal bound, matching established uniform confidence sequences.
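
The reduction to an ellipsoid in the linear case can be checked directly. The short derivation below is a sketch under an assumed form of the regularized log-partition sum, $Z_t(\theta) = \sum_{s \le t} w_s A(x_s^\top \theta) + \frac{\lambda}{2}\|\theta\|^2$; the exact definition and constants used by (Emmenegger et al., 2023) may differ. With $A(z) = z^2/2$,

$Z_t(\theta) = \frac{1}{2}\, \theta^\top V_t\, \theta, \qquad V_t = \sum_{s \le t} w_s\, x_s x_s^\top + \lambda I,$

and the Bregman divergence of this quadratic is

$D_{Z_t}(\theta, \theta') = Z_t(\theta) - Z_t(\theta') - \nabla Z_t(\theta')^\top (\theta - \theta') = \frac{1}{2} (\theta - \theta')^\top V_t\, (\theta - \theta') = \frac{1}{2} \|\theta - \theta'\|_{V_t}^2.$

A bound on $D_{Z_t}$ therefore describes an ellipsoid whose shape is set by the (weighted, regularized) design matrix $V_t$.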

6. Extensions: Mixtures, Misspecification, and Bayesian Integration

The sequential likelihood mixing framework (Kirschner et al., 20 Feb 2025) generalizes likelihood ratio confidence sequences by replacing fixed parametric models with arbitrary mixtures and predictive distributions. Coverage and validity do not rely on i.i.d. data, and they remain robust under model misspecification when sub-Gaussian proxy likelihoods, or e-values satisfying appropriate supermartingale bounds, are used in place of the exact likelihood. Central theorems demonstrate that both standard Bayesian posteriors and variational posterior approximations (via the ELBO) yield anytime-valid confidence sets, with regret bounds from online learning translating directly into confidence set radii.

Moreover, the mixture framework recovers several classical and modern results:

Ellipsoidal confidence regions in sequential linear regression match the Abbasi-Yadkori–Pál bounds.
Sparsity-adaptive confidence sets are constructed using spike-and-slab priors for high-dimensional regression settings.
Regret-to-confidence conversion is enabled for any online log-loss forecaster, encompassing classical exponential weighting and variational techniques.
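
A simple instance of likelihood mixing has a closed form, which helps make the mixture idea concrete. The sketch below is the classical method-of-mixtures special case, not the general framework of (Kirschner et al., 20 Feb 2025): it assumes unit-variance Gaussian observations, a Gaussian mixing prior $N(0, \tau^2)$, and a denominator fixed at the likelihood of a candidate parameter `theta0`. The name `log_mixture_lr` is hypothetical.

```python
import numpy as np

def log_mixture_lr(x, theta0, tau=1.0):
    """Log of the prior-mixed likelihood over the likelihood at theta0.

    Assumed model (illustration only): x_t ~ N(theta, 1), prior N(0, tau^2).
    Returns one value per sample size n = 1..len(x).
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(1, len(x) + 1)
    s = np.cumsum(x)
    # Gaussian integral over theta; the common factor exp(-sum(x^2)/2)
    # cancels between numerator and denominator and is omitted.
    log_marginal = (-0.5 * np.log1p(n * tau**2)
                    + tau**2 * s**2 / (2.0 * (1.0 + n * tau**2)))
    log_null = theta0 * s - 0.5 * n * theta0**2
    return log_marginal - log_null
```

Under these assumptions, a candidate `theta0` would be kept in the confidence sequence while the returned value stays at or below $\log(1/\alpha)$, mirroring the plug-in construction but with the prior doing the averaging.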

7. Applications: Bandits, Survival Analysis, and Distributed Detection

Sequential likelihood ratios and their mixture counterparts enable a spectrum of applied adaptive inference procedures.

In generalized linear bandits, UCB-style selection based on likelihood ratio confidence sets yields provable regret bounds matching established rates (Emmenegger et al., 2023).
For additive noise settings (Gaussian or Laplace), likelihood ratio methods produce valid, tight confidence sequences even when classical sub-Gaussian intervals fail, adapting to heavy-tailed scenarios.
In survival analysis, models such as the Weibull hazard model can be transformed into exponential family form, so that likelihood-ratio-based confidence sets are convex and provide effective uncertainty quantification for hazard rates.
For distributed detection in cell-free massive MIMO, local access points forward partial sufficient statistics that are sequentially fused, ultimately resulting in globally optimal log-likelihood ratios (LLRs) for soft decoding at a CPU while minimizing fronthaul bandwidth requirements. This sequential computation matches the error rate performance of fully centralized computation for phase-shift keying constellations (Shaik et al., 2021).
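
The idea of sequentially fusing local statistics into a global LLR can be illustrated in a deliberately simplified setting. The sketch below is not the algorithm of (Shaik et al., 2021): it assumes a single user, real-valued BPSK symbols $s \in \{+1, -1\}$, known channel gains, and independent Gaussian noise at each access point, with no multiuser interference. All function names are hypothetical.

```python
import numpy as np

def local_llr(y, h, noise_var):
    """log p(y | s=+1) - log p(y | s=-1) for y = h*s + n, n ~ N(0, noise_var)."""
    return 2.0 * h * y / noise_var

def sequential_fusion(ys, hs, noise_vars):
    """Pass a running LLR from one access point to the next."""
    total = 0.0
    for y, h, v in zip(ys, hs, noise_vars):
        total += local_llr(y, h, v)  # each AP adds only its own term
    return total

def centralized_llr(ys, hs, noise_vars):
    """LLR computed from all raw observations at once, for comparison."""
    ys, hs, noise_vars = (np.asarray(a, dtype=float)
                          for a in (ys, hs, noise_vars))
    ll_plus = -0.5 * np.sum((ys - hs) ** 2 / noise_vars)
    ll_minus = -0.5 * np.sum((ys + hs) ** 2 / noise_vars)
    return ll_plus - ll_minus
```

In this toy model each access point forwards a single scalar instead of its raw observation, and the fused value coincides with the centralized LLR because the noise terms are independent.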

In all cases, empirical findings indicate that likelihood ratio-based or sequential likelihood mixing confidence sequences not only ensure rigorous anytime-valid coverage but routinely yield smaller uncertainty regions, leading to improvements in regret and pointwise inference relative to traditional methods (Emmenegger et al., 2023, Kirschner et al., 20 Feb 2025).


References

Emmenegger et al. (2023). Likelihood Ratio Confidence Sets for Sequential Decision Making.

Kirschner et al. (2025). Confidence Estimation via Sequential Likelihood Mixing.

Chen (2012). Consecutive Sequential Probability Ratio Tests of Multiple Statistical Hypotheses.

Shaik et al. (2021). Distributed Computation of A Posteriori Bit Likelihood Ratios in Cell-Free Massive MIMO.
