Extended Robbins-Siegmund Theorem
- The extended Robbins-Siegmund theorem offers quantitative convergence rates for stochastic processes, lifting the classical requirement of summable noise.
- It unifies varied iterative schemes by integrating uniform and metastable forms to provide explicit mean and almost sure convergence rates with high probability bounds.
- These enhancements broaden applications in reinforcement learning, stochastic approximation, and optimization by effectively controlling noise and increment regimes.
The extensions of the Robbins-Siegmund theorem address both foundational and practical limitations inherent in the original version, which concerns the almost sure convergence of nonnegative stochastic processes exhibiting an "almost supermartingale" structure. New developments provide quantitative convergence rates, broaden the theorem's applicability to non-summable noise settings, integrate uniform and metastable quantitative forms, and unify disparate iterative schemes under generalized regularity assumptions.
1. Classical Robbins-Siegmund Theorem and Limitations
The classical Robbins-Siegmund framework asserts the following: if $(X_n)$ is an adapted sequence of non-negative integrable random variables with respect to a filtration $(\mathcal{F}_n)$, and, for deterministic sequences $(a_n)$, $(b_n)$ of non-negative reals and a non-negative adapted sequence $(C_n)$, the process satisfies

$$\mathbb{E}[X_{n+1} \mid \mathcal{F}_n] \le (1 + a_n)\,X_n + b_n - C_n \quad \text{almost surely},$$

with $\sum_n a_n < \infty$ and $\sum_n b_n < \infty$, then $(X_n)$ almost surely converges and $\sum_n C_n < \infty$ almost surely.
A major drawback is its qualitative nature: no information about rates of convergence is available. Furthermore, critical applications, especially in reinforcement learning and general stochastic approximation, fail to satisfy the summability condition $\sum_n b_n < \infty$, as the zero-order ("noise") term may be only square summable or have even weaker control.
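As a concrete illustration (a toy sketch; the quadratic objective, step sizes $\alpha_n = 1/n$, and Gaussian noise are assumptions for this example, not taken from the literature above), SGD on a one-dimensional quadratic yields a process $X_n = (x_n - x^*)^2$ fitting the almost-supermartingale template:

```python
import random

random.seed(0)

x_star = 3.0   # true minimizer (toy assumption)
x = 0.0        # initial iterate
sigma = 1.0    # standard deviation of the gradient noise

for n in range(1, 200_001):
    alpha = 1.0 / n                    # step sizes with sum(alpha_n^2) < infinity
    noise = random.gauss(0.0, sigma)   # zero-mean noise
    grad = (x - x_star) + noise        # noisy gradient of f(x) = (x - x_star)^2 / 2
    x -= alpha * grad

X_final = (x - x_star) ** 2            # the Robbins-Siegmund process at the last step
print(f"final iterate: {x:.4f}, X_n = {X_final:.2e}")
```

Here $\mathbb{E}[X_{n+1} \mid \mathcal{F}_n] = (1 - \alpha_n)^2 X_n + \alpha_n^2 \sigma^2$, which matches the template with $a_n = 0$ and $b_n = \alpha_n^2 \sigma^2$ summable, so the theorem guarantees $X_n \to 0$ almost surely.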
2. Relaxed Supermartingale Conditions and Quantitative Extensions
Recent extensions provide general convergence theorems for processes obeying a relaxed supermartingale property. Key quantitative results (see (Neri et al., 17 Apr 2025)) replace pointwise decrements with mean or almost sure reduction of a transformed process $(\varphi(X_n))$, where $\varphi$ is strictly increasing, concave, continuous, and supermultiplicative.
Quantitative rates are derived in both mean and almost sure senses, uniform over all but minimal problem data, namely uniform product and summability controls, and moduli for continuity and regularity. Two kinds of bounds are obtained:
- Mean rate: an explicit index, computable from the moduli, after which $\mathbb{E}[\varphi(X_n)]$ falls below any prescribed tolerance $\varepsilon > 0$.
- Almost sure rate: for any $\varepsilon > 0$ and confidence level $\lambda > 0$, an explicit index after which $\varphi(X_n) \le \varepsilon$ holds with probability at least $1 - \lambda$.
This formulation applies to cases where only a transformed version of the process converges (e.g., its square root or logarithm), and can recover classical polynomial rates, or even linear rates in strong regularity regimes.
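As a standard check (independent of the cited paper), the square-root transform $\varphi(x) = \sqrt{x}$ satisfies all four structural requirements:

```latex
\varphi(x) = \sqrt{x}:\quad
\begin{aligned}
&\text{strictly increasing and continuous on } [0,\infty), \\
&\text{concave: } \varphi''(x) = -\tfrac{1}{4}\,x^{-3/2} \le 0 \text{ for } x > 0, \\
&\text{supermultiplicative: } \varphi(xy) = \sqrt{xy} = \varphi(x)\,\varphi(y).
\end{aligned}
```

A rate for $\sqrt{X_n}$ then translates into a (squared) rate for $X_n$ itself.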
3. Beyond Summability: Square Summable Regimes and Controlled Increments
A significant extension is achieved by relaxing the requirement that $\sum_n b_n < \infty$. In (Liu et al., 30 Sep 2025), summability is replaced by square summability ($\sum_n b_n^2 < \infty$), paired with a control on the increments $X_{n+1} - X_n$. Under additional mild drift conditions outside a bounded set, almost sure convergence to a set is established with explicit rates and, crucially, new high-probability concentration bounds are also obtained. This is particularly central in reinforcement learning, where $Q$-learning with linear function approximation does not admit the classical summable noise structure.
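The gap between the two regimes is easy to see numerically: a noise sequence like $b_n = 1/n$ is square summable but not summable, so it falls outside the classical theorem yet satisfies the relaxed condition. A minimal check:

```python
# Partial sums of b_n = 1/n (diverges like log N) vs. b_n^2 = 1/n^2 (converges to pi^2/6).
N = 1_000_000
sum_b = sum(1.0 / n for n in range(1, N + 1))
sum_b_sq = sum(1.0 / n ** 2 for n in range(1, N + 1))

print(f"sum of 1/n   up to {N}: {sum_b:.4f}")      # keeps growing with N
print(f"sum of 1/n^2 up to {N}: {sum_b_sq:.6f}")   # close to pi^2/6
```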
Table: Key Contributions Over Prior Work
| Method | a.s. convergence | High prob. bound | Rate | General Markovian/inhomogeneous noise |
|---|---|---|---|---|
| Classical RS | ✓ (pointwise) | – | – | – (summable noise only) |
| Extended (Liu et al., 30 Sep 2025) | ✓ (to a set, with rate) | ✓ | ✓ (explicit) | ✓ |
4. Quantitative Metastable Versions
Uniform quantitative control in the classical sense is precluded by logical barriers (e.g., the existence of Specker sequences), but a metastability approach, stemming from Tao's notion, allows explicit bounds on when the process exhibits $\varepsilon$-stability within a finite interval of prescribed length. Given a nonnegative integrable almost supermartingale $(X_n)$ under moduli of boundedness, it is shown (Neri et al., 21 Oct 2024) that for any $\varepsilon > 0$ and control function $g : \mathbb{N} \to \mathbb{N}$, there is $N \le \Psi(\varepsilon, g)$ such that, with high probability,

$$|X_i - X_j| \le \varepsilon \quad \text{for all } i, j \in [N, N + g(N)].$$

Here, $\Psi$ is explicitly computable in terms of the bounds on the process.
This metastable approach enables "finitization" of classical convergence—guaranteeing not ultimate convergence, but that after explicit time, the process remains approximately stable for a prescribed length, with high probability.
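The idea can be made concrete with a small search routine (a hypothetical illustration; the sequence, tolerance $\varepsilon$, and interval-length function $g$ are all choices for this example): find the first index $N$ such that the sequence fluctuates by at most $\varepsilon$ over the entire window $[N, N + g(N)]$.

```python
import math

def metastable_index(seq, eps, g):
    """First N whose window [N, N + g(N)] has fluctuation (max - min) at most eps."""
    for N in range(len(seq)):
        window = seq[N : N + g(N) + 1]
        if len(window) < g(N) + 1:
            return None  # not enough of the sequence left to check the full window
        if max(window) - min(window) <= eps:
            return N
    return None

# A slowly converging sequence with a decaying oscillation (toy example).
xs = [1.0 / math.sqrt(n + 1) + 0.1 * (-1) ** n / (n + 1) for n in range(10_000)]

N = metastable_index(xs, eps=0.01, g=lambda n: 2 * n + 10)
print(f"epsilon-stable on [{N}, {N + 2 * N + 10}]")
```

Metastability guarantees such an $N$ below an explicit bound $\Psi(\varepsilon, g)$ even when no computable rate of convergence exists.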
5. Applications and Specializations
The broadened Robbins-Siegmund framework applies to:
- Stochastic approximation: Classic and modern algorithms, including Robbins-Monro, SGD, and stochastic subgradient methods.
- Reinforcement learning: Rigorous almost sure convergence, explicit rates, and concentration bounds are now established for $Q$-learning with linear function approximation, addressing a central open problem in RL theory (Liu et al., 30 Sep 2025, Zhang, 5 Nov 2025).
- Dvoretzky’s theorem and stochastic quasi-Fejér monotonicity: New uniform, effective rates in abstract metric and Hilbert spaces (Neri et al., 17 Apr 2025).
- Confidence sequence methodology: Underpins constructions like Robbins’ confidence sequences, which control inferential contradiction probabilities over growing samples (2002.03658), and extended Ville's inequalities for broad sequential analysis tasks (Wang et al., 2023).
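As a classical specialization of the first item, a Robbins-Monro sketch for quantile estimation (toy setup; the Uniform(0, 1) samples, quantile level, and step sizes are assumptions for illustration): the iteration $\theta_{n+1} = \theta_n - \alpha_n(\mathbf{1}[\xi_n \le \theta_n] - p)$ drives $F(\theta_n) - p$ to zero, so $\theta_n$ converges to the $p$-quantile.

```python
import random

random.seed(1)

p = 0.9       # target quantile level; for Uniform(0, 1) the true p-quantile is 0.9
theta = 0.5   # initial estimate

for n in range(1, 500_001):
    alpha = 1.0 / n
    xi = random.random()  # fresh Uniform(0, 1) sample
    theta -= alpha * ((1.0 if xi <= theta else 0.0) - p)

print(f"estimated {p}-quantile: {theta:.4f}")
```

The squared error $(\theta_n - 0.9)^2$ plays the role of $X_n$ in the Robbins-Siegmund template.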
6. Theoretical and Methodological Implications
The current suite of extensions yields:
- Uniform, modular framework: Both mean and almost sure convergence rates, portable to new stochastic iterative procedures and optimization algorithms.
- Adaptivity: Results are not only applicable for i.i.d. noise but generalize naturally to Markovian and non-homogeneous noise, with explicit increment and drift controls.
- Proof mining/logical methodology: Demonstrates the reach and limitations of uniform "constructive" rates, clarifying that metastable and set-level convergence are optimal under weak regularity.
- Formalization: Recent works formally encode these results in proof assistants (Lean 4), making foundational reinforcement learning convergence theorems fully mechanizable and extensible (Zhang, 5 Nov 2025).
7. Future Directions and Open Problems
Outstanding questions include:
- Further generalization of extended inequalities (e.g., to submartingales or continuous-time processes, as per the open ends of (Wang et al., 2023)).
- Tight characterization of trade-offs between set-valued and pointwise convergence under weaker noise and increment conditions.
- Full development of quantitative metastable and deterministic-to-stochastic transference principles as modular tools in optimization and learning theory.
Overall, the recent extensions of the Robbins-Siegmund theorem subsume a large family of convergence results for stochastic iterative methods with explicit, constructive, and portable rates—addressing long-standing theoretical and practical gaps across stochastic approximation, statistical inference, and machine learning.