Almost-Sure Regret Bound in Online Prediction

Updated 23 November 2025

Almost-sure regret bounds are rigorous guarantees on cumulative loss that hold with probability one, eliminating fixed failure rates in online multi-step-ahead prediction.
They leverage conditional distribution theory and self-normalized martingale inequalities to achieve logarithmic regret rates with polynomial scaling in the prediction horizon.
These bounds enable robust, real-time deployment of online forecasting algorithms that closely approximate Bayesian predictors even in complex, linear stochastic systems.

An almost-sure regret bound is a guarantee on the difference between the cumulative loss incurred by an online forecasting algorithm and the cumulative loss of a benchmark (such as the best fixed predictor, a convex aggregation, or an optimal Kalman filter), holding with probability one over the realizations of the stochastic process. In contrast to bounds expressed in expectation or with high probability, almost-sure bounds eliminate fixed failure rates and quantify regret at all sufficiently large time horizons. In multi-step-ahead time series prediction, recent research has established almost-sure logarithmic regret rates, with polynomial scaling in the prediction horizon.

1. Regret in Online Multi-Step-Ahead Prediction

Regret quantifies excess loss relative to an optimal reference. In the context of online multi-step forecasting for linear stochastic systems, let $\tilde y_{k+H}$ be the forecast of the algorithm for step $k+H$ and $\bar y_{k+H}$ be the Bayesian optimal predictor (e.g., multi-step Kalman filter). The cumulative regret up to horizon $N$ is

$\mathcal R_N = \sum_{k=1}^N \lVert y_{k+H}-\tilde y_{k+H}\rVert^2 - \sum_{k=1}^N \lVert y_{k+H}-\bar y_{k+H}\rVert^2$

where $y_{k+H}$ are the true observations. The goal is to bound $\mathcal R_N$ as $N\to\infty$ .

2. Almost-Sure Regret Bound: Formal Statement and Techniques

The almost-sure regret bound guarantees that, for all sufficiently large $N$ , the excess loss exhibits no dependence on fixed failure probabilities. In the context of linear systems,

$\mathcal R_N \le M H^{4\kappa+1}\beta^3\,\mathcal O(\log^7 N)$

with probability one, where $H$ is the prediction horizon; $\kappa$ is the size of the largest Jordan block at eigenvalue $1$ in the system matrix $A$ ; $\beta=c(\kappa+\log H)/\log(1/\rho(A-LC))$ ; $M$ is a system-dependent constant; $\rho(A-LC)$ is the spectral radius of the filter update matrix. This result dispenses with confidence levels: the logarithmic regret and its scaling hold on the sample path, not merely in probability (Qian et al., 16 Nov 2025).

Proof techniques combine conditional distribution theory, autoregressive regression parametrization, and self-normalized martingale inequalities with $\delta=1/N$ error decay, so no fixed failure rate parameter persists. Each source of error (bias from truncated back horizon $p$ , regression error from non-orthogonality, and accumulation of self-normalized terms) is controlled almost surely.

3. Implications for Multi-Step Forecasting Algorithms

The almost-sure regret bound establishes that (i) online least-squares prediction in unknown linear stochastic systems tracks the optimal multi-step Bayesian predictor up to an $\mathcal O(\log N)$ excess for large $N$ , and (ii) the multiplicative constant grows polynomially with the forecast horizon $H$ , parameterized by the algebraic structure of $A$ . In practical terms, this confirms that such algorithms adapt in nonstationary environments with provable performance guarantees, enabling deployment without need for repeated failure-probability tuning.

For systems with marginal stability ( $\kappa>1$ ), long-horizon prediction becomes more difficult but remains feasible if $H$ is moderate. The backward horizon $p$ required for bias control scales as $O((\kappa+\log H)/\log(1/\rho(A-LC))\cdot \log N)$ (Qian et al., 16 Nov 2025).

4. Comparison with Prior Adversarial and Probabilistic Regret Bounds

Earlier works in online prediction, notably the prediction-with-expert-advice (PEA) smoothing framework, provided adversarial $O(\ln T)$ regret bounds for both point forecast and function aggregation. However, those results were typically stated for single-step or fixed-horizon prediction and in terms of worst-case or expectation bounds (Korotin et al., 2017). The almost-sure bound surpasses these by ensuring the excess loss vanishes asymptotically on virtually every realization, not only in expectation or up to pre-specified confidence.

In expert aggregation settings, adaptive conformal prediction also achieves long-run coverage control, but the bounds are on empirical coverage, not prediction error regret (Sousa et al., 2022, Szabadváry, 2024). Likewise, feature-adaptation approaches control loss empirically (e.g., mean squared error) but do not provide almost-sure regret guarantees (Huang et al., 4 Sep 2025).

5. Statistical Significance and Polynomial Scaling

The polynomial scaling of the regret constant with $H$ arises from the spectral and algebraic properties of the system matrix. The AR-type recursion satisfied by the regressor vector $Z_{k,p}$ and corresponding bounds on quadratic forms yield polynomial growth in $H$ proportional to the largest Jordan block degree. This quantifies how error propagation in forecast horizons is governed by system stability, as opposed to probabilistic concentration.

6. Practical Considerations

Model Selection: Backward horizon $p$ should be chosen as $p\propto (\log H)\,\log N$ for stability and low bias.
Computational Complexity: The doubling-epoch update scheme ensures computational cost remains $O(\log N)$ .
Applicability: Almost-sure regret bounds are valid for general linear systems with steady-state Kalman filter approximations and apply to practical deployments with no tuning for error rates.

7. Summary Table: Regret Bound Types in Multi-Step Time Series Prediction

Bound Type	Expression	Probability	Horizon Scaling
Expected regret	$\mathbb E[\mathcal R_N]\le O(\log N)$	In expectation	Typically sublinear
High-prob. regret	$\mathcal R_N\le O(\log N)$ w.p. $1-\delta$	For $\delta\in(0,1)$	Sublinear/polynomial
Almost-sure regret	$\mathcal R_N\le C_H \log^k N$	With prob. $1$	$C_H\propto H^{4\kappa+1}$

In summary, almost-sure regret bounds represent the strongest convergence paradigm for online multi-step-ahead prediction in linear stochastic systems, ensuring robust adaptation with respect to predictive error for all sample paths, and clarifying how system dynamics induce scaling effects with horizon length (Qian et al., 16 Nov 2025).