
Non-Asymptotic Sub-Gaussian Bound

Updated 21 November 2025
  • Non-asymptotic sub-Gaussian concentration bounds are probabilistic tools that provide explicit exponential tail decay for sums, norms, and quadratic forms of sub-Gaussian random variables.
  • They utilize techniques such as averaged moment generating functions (AMGF), epsilon-net arguments, and PAC-Bayes variational methods to deliver sharp, parameter-controlled bounds that capture dimension-dependent or dimension-free rates.
  • These bounds are pivotal in high-dimensional statistics, randomized algorithms, and optimization, offering reliable performance guarantees even for moderate sample sizes.

A non-asymptotic sub-Gaussian concentration bound characterizes the deviation probabilities for sums, norms, or quadratic forms of sub-Gaussian random variables and vectors in finite samples, with explicit, parameter-controlled exponential tail decay. These bounds quantify, for arbitrary sample size and confidence level, the probability that an observable deviates from its expectation by a prescribed amount, with constants and rates reflecting the sub-Gaussian nature of underlying distributions. Non-asymptotic analysis is critical for modern high-dimensional statistics, randomized algorithms, and theoretical computer science, and has seen technical innovations allowing sharper, dimension-dependent or dimension-free formulations.

1. Sub-Gaussian Random Variables, Vectors, and Norms

A real random variable $X$ is sub-Gaussian with variance proxy $\sigma^2$ if its moment generating function (MGF) satisfies

$$E[e^{\lambda X}] \leq \exp\left(\frac{\lambda^2 \sigma^2}{2}\right), \quad \forall \lambda \in \mathbb{R}.$$

For a vector $X \in \mathbb{R}^n$, sub-Gaussianity requires that for every unit vector $\ell \in S^{n-1}$, the projection $\langle \ell, X \rangle$ is sub-Gaussian with the same proxy. Equivalently,

$$E[e^{\lambda \langle \ell, X \rangle}] \leq \exp\left(\frac{\lambda^2 \sigma^2}{2}\right), \quad \forall \lambda \in \mathbb{R},\ \ell \in S^{n-1}.$$

Sub-Gaussianity implies sub-Gaussian tails:
$$P(|X| \geq t) \leq 2\exp\left(-\frac{t^2}{2\sigma^2}\right), \quad t \geq 0,$$
and similarly for sums of independent centered sub-Gaussians $S = \sum_{i=1}^n X_i$ with variance proxy $\Sigma^2 = \sum_i \sigma_i^2$:
$$P(S \geq t) \leq \exp\left(-\frac{t^2}{2\Sigma^2}\right).$$
For vectors, the Euclidean norm $\|X\|_2$ is typically considered; its concentration is nontrivial due to the geometric structure of $\mathbb{R}^n$.
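
As a concrete sanity check (a minimal numerical sketch, not drawn from any cited paper), the snippet below verifies the MGF condition for a Rademacher variable, which is sub-Gaussian with variance proxy $\sigma^2 = 1$ since $E[e^{\lambda X}] = \cosh\lambda \leq e^{\lambda^2/2}$, and compares the sum tail bound with a Monte Carlo estimate; the sample count and threshold are arbitrary choices.

```python
# Minimal sketch (illustration only): check the sub-Gaussian MGF condition
# for a Rademacher variable and the tail bound for a sum of n of them.
import numpy as np

rng = np.random.default_rng(0)

# MGF check: cosh(lambda) <= exp(lambda^2 / 2) for a Rademacher variable.
for lam in (0.5, 1.0, 2.0):
    assert np.cosh(lam) <= np.exp(lam**2 / 2)

# Tail bound P(S >= t) <= exp(-t^2 / (2*Sigma^2)) with Sigma^2 = n
# (each sigma_i = 1), compared against Monte Carlo.
n, t = 100, 25.0
Sigma2 = float(n)
bound = np.exp(-t**2 / (2 * Sigma2))

S = rng.choice([-1.0, 1.0], size=(100_000, n)).sum(axis=1)
empirical = (S >= t).mean()
print(f"bound={bound:.3e}  empirical={empirical:.3e}")  # empirical stays below bound
```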

2. Non-Asymptotic Vector Norm Concentration via AMGF

"A New Proof of Sub-Gaussian Norm Concentration Inequality" (Liu et al., 18 Mar 2025) introduces the averaged moment generating function (AMGF)

$$\Phi_{n,\lambda}(x) := E_{\ell \sim \mathrm{Unif}(S^{n-1})}\left[e^{\lambda \langle \ell, x \rangle}\right],$$

and its expectation, $M_\mathrm{avg}(\lambda) := E_X[\Phi_{n,\lambda}(X)]$. By rotational invariance and convexity arguments, for every $\epsilon \in (0,1)$ and $x \in \mathbb{R}^n$,

$$\Phi_{n,\lambda}(x) \geq (1-\epsilon^2)^{n/2}\, e^{\epsilon \lambda \|x\|}.$$

Since $\Phi_{n,\lambda}(x)$ depends on $x$ only through $\|x\|$ and is increasing in it, Markov's inequality applied to $\Phi_{n,\lambda}(X)$ yields, for any unit vector $\eta \in S^{n-1}$, the bound

$$P(\|X\|_2 > r) \leq \frac{E[\Phi_{n,\lambda}(X)]}{\Phi_{n,\lambda}(r\eta)} \leq \frac{\exp(\lambda^2 \sigma^2 / 2)}{(1-\epsilon^2)^{n/2}\, e^{\epsilon \lambda r}}$$

is optimized at $\lambda^* = \epsilon r / \sigma^2$, yielding

$$P(\|X\|_2 > r) \leq (1-\epsilon^2)^{-n/2} \exp\left(-\frac{\epsilon^2 r^2}{2\sigma^2}\right).$$

Solving for $r$ to achieve confidence $1-\delta$, for any $\delta \in (0,1)$, one obtains the explicit non-asymptotic bound
$$\|X\|_2 \leq \sigma \sqrt{\frac{\log(1/(1-\epsilon^2))}{\epsilon^2}\, n + \frac{2}{\epsilon^2} \log(1/\delta)},$$
with leading constant $C_1 = \log(1/(1-\epsilon^2))/\epsilon^2$, which is strictly smaller than that of traditional $\epsilon$-net approaches. For instance, for $\epsilon = 1/2$, $C_1 \approx 5.54$ versus $\approx 16$ for the union-bound-based proof (Liu et al., 18 Mar 2025).
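
The explicit radius is straightforward to evaluate numerically. The sketch below (an illustration of the stated formula only; the helper name and the grid over $\epsilon$ are my own choices, not from Liu et al.) computes $r(\epsilon)$ and picks the best $\epsilon$:

```python
# Sketch: evaluate the AMGF-based high-probability radius
#   r(eps) = sigma * sqrt(C1(eps) * n + (2/eps^2) * log(1/delta)),
#   C1(eps) = log(1/(1 - eps^2)) / eps^2,
# and minimize it over a grid of eps in (0, 1).
import numpy as np

def amgf_radius(n, sigma, delta, eps):
    C1 = np.log(1.0 / (1.0 - eps**2)) / eps**2
    return sigma * np.sqrt(C1 * n + (2.0 / eps**2) * np.log(1.0 / delta))

n, sigma, delta = 1000, 1.0, 1e-3
eps_grid = np.linspace(0.05, 0.95, 181)
radii = np.array([amgf_radius(n, sigma, delta, e) for e in eps_grid])
best = int(np.argmin(radii))
print(f"best eps={eps_grid[best]:.2f}, r={radii[best]:.2f}")
# With probability >= 1 - delta, ||X||_2 <= r for any sub-Gaussian
# vector in R^n with variance proxy sigma^2.
```

For these inputs the optimized radius lands only modestly above $\sqrt{n} \approx 31.6$, the typical norm of a standard Gaussian vector in $\mathbb{R}^{1000}$, illustrating how tight the dimension term is.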

3. Classical Covering Arguments and Matrix Extensions

Traditional non-asymptotic vector/matrix concentration bounds employ $\epsilon$-net techniques over the unit sphere, resulting in union bounds with cardinality $(1+2/\epsilon)^n$ and dimension-dependent rates (Vershynin, 2010). For i.i.d. sub-Gaussian vectors or random matrices, operator norm and singular value concentration typically take the form
$$P\left(\left\|\sum_i X_i\right\| > t\right) \leq (m+n)\exp\left(-\frac{t^2}{2b^2 m}\right),$$
where $b$ bounds the sub-Gaussian $\psi_2$-norm of entries, and $m, n$ are the matrix dimensions (Gao et al., 2019). Recent matrix concentration works refine these to tighter two-regime bounds, offering exponentially small tails even for moderate deviations, and dimension-free rates for large deviations in suitable regimes (Gao et al., 2019, Vershynin, 2010).
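
For planning purposes one often inverts such a bound: setting the right-hand side equal to $\delta$ gives $t = b\sqrt{2m\log((m+n)/\delta)}$. A minimal sketch of this inversion (variable names are illustrative; the formula is the one displayed above):

```python
# Sketch: solve (m + n) * exp(-t^2 / (2*b^2*m)) = delta for t.
import math

def matrix_deviation(m, n, b, delta):
    return b * math.sqrt(2.0 * m * math.log((m + n) / delta))

print(matrix_deviation(m=500, n=200, b=1.0, delta=1e-2))  # deviation at confidence 1 - delta
```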

Operator norm concentration can also be established for heteroskedastic Wishart-type matrices, with tight tail bounds in terms of maximal column, row, and element variance proxies ($\Sigma_C$, $\Sigma_R$, $\Sigma_*$), matching minimax lower bounds (Cai et al., 2020).

4. Quadratic Forms and Hanson–Wright Inequality

The non-asymptotic Hanson–Wright inequality (Rudelson et al., 2013) provides tail bounds for quadratic forms $Q(X) = X^\top A X$, where $X$ has independent sub-Gaussian coordinates:
$$P\left(|Q(X) - E\,Q(X)| > t\right) \leq 2\exp\left(-c \min\left(\frac{t^2}{K^4 \|A\|_{HS}^2}, \frac{t}{K^2 \|A\|}\right)\right),$$
where $K = \max_i \|X_i\|_{\psi_2}$ and $\|A\|_{HS}$ is the Hilbert–Schmidt norm of $A$. For the squared norm $\|AX\|_2^2$, tail bounds of the same form hold, and deviations concentrate sharply around their expectation uniformly over finite $n$ (Rudelson et al., 2013).
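
A hedged evaluation sketch follows: the absolute constant $c$ is left unspecified in the statement above, so $c = 1$ is used purely as a placeholder, and the printed value should not be read as a rigorous probability.

```python
# Sketch: evaluate the Hanson-Wright tail bound for a given matrix A.
# The absolute constant c is unspecified in the inequality; c=1 is an
# arbitrary placeholder for illustration.
import numpy as np

def hanson_wright_bound(A, K, t, c=1.0):
    op = np.linalg.norm(A, ord=2)       # operator norm ||A||
    hs = np.linalg.norm(A, ord="fro")   # Hilbert-Schmidt norm ||A||_HS
    return 2.0 * np.exp(-c * min(t**2 / (K**4 * hs**2), t / (K**2 * op)))

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 50))
print(hanson_wright_bound(A, K=1.0, t=200.0))
```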

5. Optimality, Lower Bounds, and Dimension-Free Variational Bounds

Recent advances guarantee sharp, non-asymptotic lower bounds matching upper exponential tails up to constants (Zhang et al., 2018), e.g.,

$$c \exp\left(-C\,\frac{t^2}{\Sigma^2}\right) \leq P(S \geq t) \leq \exp\left(-\frac{t^2}{2\Sigma^2}\right),$$

with universal constants $c, C$. These bounds remain valid for weighted sums and heavy-tailed regimes interpolating between sub-Gaussian and sub-Weibull cases (Zhang et al., 2021, Zhang et al., 2023).

Dimension-free forms are established by PAC-Bayes variational techniques, yielding self-normalized confidence ellipsoids for vector-valued stochastic processes, such as

$$\|S_\tau\|^2_{(V_\tau + U_0)^{-1}} \leq \log\frac{\det(V_\tau + U_0)}{\det(U_0)} + 2\log(1/\delta),$$

for any stopping time $\tau$, regularizer $U_0$, and confidence level $1-\delta$ (Chugg et al., 8 Aug 2025). This structure avoids the explicit $\sqrt{d}$ or $d$ factors typical of union-bound-based concentration and enables efficient high-dimensional inferential procedures.
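
To see the self-normalized inequality in action, the following sketch (my own construction for illustration, not the authors' code) simulates a standard Gaussian increment process, takes $V_t = tI$ and $U_0 = I$, and checks the ellipsoid inequality at every step of the trajectory; by the uniform-in-time guarantee, violations should occur with frequency well below $\delta$.

```python
# Sketch: empirical check of the self-normalized confidence ellipsoid
#   ||S_t||^2_{(V_t + U0)^{-1}} <= logdet(V_t + U0) - logdet(U0) + 2*log(1/delta)
# along a Gaussian trajectory, with V_t = t*I and U0 = I (my choices).
import numpy as np

rng = np.random.default_rng(2)
d, T, delta = 20, 2000, 0.05
U0 = np.eye(d)
S = np.zeros(d)
violations = 0
for t in range(1, T + 1):
    S += rng.standard_normal(d)          # unit-variance sub-Gaussian increments
    M = t * np.eye(d) + U0               # V_t + U0
    lhs = S @ np.linalg.solve(M, S)      # squared Mahalanobis norm of S_t
    rhs = np.linalg.slogdet(M)[1] + 2 * np.log(1 / delta)  # logdet(U0) = 0
    violations += lhs > rhs
print(f"fraction of steps violating the bound: {violations / T:.4f}")
```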

6. Applications and Implications

Non-asymptotic sub-Gaussian concentration bounds underpin sharp sample complexity and threshold computations in high-dimensional inference, randomized numerical linear algebra, compressive sensing, bandit algorithms, and empirical mean estimation. In optimization, gradient-based algorithms under sub-Gaussian noise (e.g., Stochastic Mirror Descent) inherit explicit concentration rates for function value gaps and iterates, with dependence on accuracy, confidence, and sample size spelled out quantitatively (Paul et al., 2024). Statistical procedures, such as robust covariance matrix estimation (Tyler's and Maronna’s M-estimators), admit non-asymptotic guarantees with stronger-than-classical exponential rates for the sup-norm deviation of weights and operator norm (Romanov et al., 2022).

Operator and singular value bounds for random matrices, critical for high-dimensional statistics and signal processing, are non-asymptotically valid and yield dimension-dependent rates only through log-determinant or structural constants, not as explicit leading factors (Vershynin, 2010, Gao et al., 2019, Cai et al., 2020).

7. Comparative Table: Methods and Constants

| Approach | Dimensional Constant $C_1$ | Structure |
|---|---|---|
| AMGF (spherical avg. MGF) (Liu et al., 18 Mar 2025) | $C_1 = \log(1/(1-\epsilon^2))/\epsilon^2$ (e.g., $\approx 5.5$ for $\epsilon = 1/2$) | Rotation-invariant, union-free |
| $\epsilon$-net & union bound | $C_1 = 2\log(1+2/\epsilon)/\epsilon^2$ ($\approx 16$ for $\epsilon = 1/2$) | Covering number, union bound |
| Hanson–Wright (Rudelson et al., 2013) | N/A (dimension enters via $\|A\|_{HS}$, $\|A\|$) | Quadratic forms, operator norm |
| Variational PAC-Bayes (Chugg et al., 8 Aug 2025) | Implied via $\log\det$ only | Dimension-free, log determinant |
| Matrix Bernstein (Vershynin, 2010, Gao et al., 2019) | Prefactor $(m+n)$, exponent $t^2/(2b^2 m)$ | Operator norm, structure-dependent |

The AMGF method (Liu et al., 18 Mar 2025) yields the smallest known leading constant for the dimension term among valid methods for sub-Gaussian vectors, and the dimension-free variational approach (Chugg et al., 8 Aug 2025) gives width only via the log-determinant term, not via explicit dimension factors.
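
The two closed-form constants in the table can be compared directly; the short sketch below evaluates their ratio for a few values of $\epsilon$ (formulas exactly as stated in the table; the quoted numeric examples follow the cited papers' own conventions):

```python
# Sketch: ratio of the AMGF dimensional constant to the eps-net constant,
# using the closed forms from the table above.
import math

for eps in (0.25, 0.5, 0.75):
    c_amgf = math.log(1.0 / (1.0 - eps**2)) / eps**2
    c_net = 2.0 * math.log(1.0 + 2.0 / eps) / eps**2
    print(f"eps={eps:.2f}: AMGF/net ratio = {c_amgf / c_net:.3f}")  # < 1 everywhere
```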


Non-asymptotic sub-Gaussian concentration bounds continue to evolve, with refinements in constants, structure, and dimensional dependence matching both worst-case and typical behaviors in high-dimensional random systems.
