Berry–Esseen Theorem Overview

Updated 11 March 2026
  • The Berry–Esseen theorem is a foundational result in probability that quantifies the convergence rate of the central limit theorem by comparing normalized sums to the Gaussian distribution.
  • It employs Fourier-analytic techniques and smoothing inequalities with explicit moment conditions to derive optimal bounds and sharp constants in various settings.
  • Extensions include multivariate, dependent, and functional cases, with recent research achieving faster convergence rates under enhanced moment and density assumptions.

The Berry–Esseen theorem is a fundamental result in probability theory providing explicit, non-asymptotic quantitative bounds for the convergence rate in the central limit theorem (CLT). Specifically, it quantifies, in terms of explicit moment conditions, the uniform difference between the distribution function of a normalized sum of independent random variables and the limiting Gaussian distribution. The theorem and its variants underpin precise assessments of normal approximation for sums of independent and, in many generalizations, dependent random variables and statistics. This article presents rigorous statements, methods of proof, optimality issues, and connections to higher-dimensional, functional, and specialized probabilistic regimes as established in recent research literature.

1. Classical Statement and Sharp Constants

Let $X_1, X_2, \dots, X_n$ be independent, identically distributed real random variables with zero mean, variance $\sigma^2 > 0$, and finite third absolute moment $\rho^3 = \mathbb{E}[|X_i|^3]$. Define the sum $S_n = \sum_{i=1}^n X_i$ and the distribution function of its normalization, $F_n(x) = \mathbb{P}(S_n / (\sigma\sqrt{n}) \le x)$. Let $\Phi(x)$ denote the standard normal distribution function. The classical Berry–Esseen theorem asserts the existence of a universal constant $C$ such that

$$\sup_{x \in \mathbb{R}} |F_n(x) - \Phi(x)| \le C\,\frac{\mathbb{E}[|X_1|^3]}{\sigma^3 \sqrt{n}}.$$

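Refined analysis yields the upper bound $C \le 0.4748$ in the i.i.d. case (Shevtsova, 2011), while Esseen's lower bound shows that no constant below $(\sqrt{10}+3)/(6\sqrt{2\pi}) \approx 0.4097$ can work, so the sharp constant lies between these two values (Vershynin, 5 Feb 2026). For independent, non-identically distributed variables, with the right-hand side replaced by the Lyapunov ratio $\sum_{i} \mathbb{E}[|X_i|^3] / (\sum_{i} \sigma_i^2)^{3/2}$, the inequality is known to hold with $C \le 0.5600$.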
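The inequality can be checked numerically for a concrete law. The following is a minimal sketch, assuming Bernoulli($p$) summands (a choice made here for illustration, since their Kolmogorov distance can be computed exactly) and using the value $0.4748$ as the constant; the function names are hypothetical.

```python
import math

C_IID = 0.4748  # upper bound on the constant quoted in the text (i.i.d. case)


def std_normal_cdf(x: float) -> float:
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bernoulli_kolmogorov_distance(n: int, p: float) -> float:
    """Exact sup_x |F_n(x) - Phi(x)| for a normalized sum of n Bernoulli(p) variables.

    F_n is a step function and Phi is continuous and increasing, so the
    supremum is attained at an atom, either at its left limit or at its value.
    """
    mean, sd = n * p, math.sqrt(n * p * (1.0 - p))
    cdf, worst = 0.0, 0.0
    for k in range(n + 1):
        phi = std_normal_cdf((k - mean) / sd)
        worst = max(worst, abs(cdf - phi))  # left limit of F_n at the atom
        cdf += math.comb(n, k) * p**k * (1.0 - p) ** (n - k)
        worst = max(worst, abs(cdf - phi))  # value of F_n at the atom
    return worst


def berry_esseen_bound(n: int, p: float) -> float:
    """C * E|X|^3 / (sigma^3 sqrt(n)) for the centered summand X = B - p."""
    sigma = math.sqrt(p * (1.0 - p))
    third_abs_moment = p * (1.0 - p) * ((1.0 - p) ** 2 + p**2)
    return C_IID * third_abs_moment / (sigma**3 * math.sqrt(n))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        dist = bernoulli_kolmogorov_distance(n, 0.5)
        bound = berry_esseen_bound(n, 0.5)
        print(f"n={n:5d}  distance={dist:.5f}  bound={bound:.5f}")
```

For the symmetric case $p = 1/2$ the ratio $\mathbb{E}[|X|^3]/\sigma^3$ equals 1, so the bound reduces to $0.4748/\sqrt{n}$, and the printed distances should decay at the same $n^{-1/2}$ rate.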

2. Fourier-Analytic Techniques and Esseen's Smoothing Inequality

Central to most proofs is Esseen’s smoothing inequality, which relates the Kolmogorov distance between distribution functions to the difference of their characteristic functions. For real random variables $X$ and $Y$, with $Y$ possessing a density bounded by $M$, and for any $T > 0$,

$$\sup_{a \in \mathbb{R}} |F_X(a) - F_Y(a)| \le \frac{2}{\pi} \int_{-T}^T \left| \frac{\phi_X(t) - \phi_Y(t)}{t} \right|\,dt + C M T^{-1}.$$

The proof proceeds via smoothing the indicator function with Schwartz-class mollifiers and careful control of the Fourier-analytic tail (Vershynin, 5 Feb 2026). Taylor expansion of the log-characteristic function facilitates tight local approximations, with large-$t$ errors addressed via uniform characteristic function bounds.
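How the smoothing inequality produces the $n^{-1/2}$ rate can be sketched in two lines. This is an outline under stated assumptions, not the argument of any particular reference: $c$, $C_1$, $C_2$ are unspecified absolute constants introduced here, and the characteristic function estimate in the first display is the standard local bound obtained from the Taylor expansion, quoted without its sharp constants.

```latex
% Sketch only: c, C_1, C_2 are unspecified absolute constants (assumptions),
% and L is the Lyapunov ratio of the normalized i.i.d. sum.
\[
  L = \frac{\mathbb{E}[|X_1|^3]}{\sigma^3\sqrt{n}}, \qquad
  \bigl|\phi_{S_n/(\sigma\sqrt{n})}(t) - e^{-t^2/2}\bigr|
    \le C_1\, L\, |t|^3 e^{-t^2/4}
  \quad \text{for } |t| \le c/L .
\]
% Apply the smoothing inequality with Y ~ N(0,1), M = 1/\sqrt{2\pi}, T = c/L,
% and use \int_{-\infty}^{\infty} t^2 e^{-t^2/4}\,dt = 4\sqrt{\pi}:
\[
  \sup_{x \in \mathbb{R}} |F_n(x) - \Phi(x)|
    \le \frac{2}{\pi}\, C_1 L \int_{-\infty}^{\infty} t^2 e^{-t^2/4}\,dt
        + \frac{C}{\sqrt{2\pi}} \cdot \frac{L}{c}
    = \Bigl( \frac{8 C_1}{\sqrt{\pi}} + \frac{C}{c\sqrt{2\pi}} \Bigr) L
    = C_2\, L .
\]
```

The choice $T \asymp 1/L$ balances the two terms: a larger $T$ would shrink the $M T^{-1}$ term but would require characteristic function control beyond the range where the Taylor expansion is accurate.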

3. Multivariate Extensions and Explicit Constants

For independent mean-zero $\mathbb{R}^d$-valued variables $X_1,\ldots,X_n$ normalized so that $\sum_{i=1}^n \operatorname{Cov}(X_i) = I_d$, let $S_n = \sum_{i=1}^n X_i$, $Z \sim N(0, I_d)$, and define the convex-set Kolmogorov distance

$$\Delta_n = \sup_{A \subset \mathbb{R}^d,\,\mathrm{convex}} \left| \mathbb{P}\{S_n \in A\} - \mathbb{P}\{Z \in A\} \right|.$$

Raič (Raič, 2018) established the explicit bound

$$\Delta_n \le (42\,d^{1/4} + 16) \sum_{i=1}^n \mathbb{E}[\|X_i\|^3],$$

and, in terms of the Gaussian perimeter constant $Y_d$,

$$\Delta_n \le \max\{27,\,1 + 50\, Y_d\} \sum_{i=1}^n \mathbb{E}[\|X_i\|^3], \quad Y_d < 0.59\, d^{1/4} + 0.21.$$

The proof employs a variant of Stein's method, smoothing convex set indicators and tightly bounding the Gaussian perimeter using explicit constants. The $d^{1/4}$ scaling in the dimension mirrors the sharp order of the Gaussian perimeter of convex sets (see Nazarov's asymptotics), but whether the numerical coefficient $42$ is improvable remains open (Raič, 2018).
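To get a feel for the size of the explicit constant, the bound can be evaluated for a simple family. The sketch below assumes summands $X_i = \xi_i/\sqrt{n}$ with $\xi_i$ uniform on $\{-1,+1\}^d$ (an illustrative choice made here, for which $\|X_i\| = \sqrt{d/n}$ exactly); all function names are hypothetical.

```python
import math


def raic_constant(d: int) -> float:
    """Dimension-dependent factor 42 d^{1/4} + 16 from the explicit bound."""
    return 42.0 * d**0.25 + 16.0


def raic_bound(d: int, sum_third_moments: float) -> float:
    """Right-hand side (42 d^{1/4} + 16) * sum_i E||X_i||^3."""
    return raic_constant(d) * sum_third_moments


def rademacher_sum_third_moments(d: int, n: int) -> float:
    """sum_i E||X_i||^3 for X_i = xi_i / sqrt(n), xi_i uniform on {-1,+1}^d.

    Each summand has norm sqrt(d/n), so the sum equals n * (d/n)^{3/2}.
    """
    return n * (d / n) ** 1.5


def samples_for_target(d: int, target: float) -> int:
    """Smallest n (up to rounding) at which the bound drops to `target`."""
    return math.ceil((raic_constant(d) * d**1.5 / target) ** 2)


if __name__ == "__main__":
    for d in (1, 16, 256):
        n = samples_for_target(d, 0.1)
        bound = raic_bound(d, rademacher_sum_third_moments(d, n))
        print(f"d={d:4d}  n needed={n}  bound at that n={bound:.4f}")
```

In this family the bound equals $(42\,d^{1/4} + 16)\, d^{3/2} / \sqrt{n}$, so the sample size needed for a fixed accuracy grows like $d^{7/2}$; this reflects the third-moment term as much as the $d^{1/4}$ factor.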

4. Fast Rates Under Regularity and Minimal Density Assumptions

Substantial recent progress demonstrates that the canonical Berry–Esseen $O(n^{-1/2})$ rate is not universal. If the summands have additional moment-matching with the normal law (up to order $k \geq 3$) and the distribution possesses a small “rectangle” where its density is lower bounded by $h$ over a width $w$, then

$$\sup_{s \in \mathbb{R}} \left| \mathbb{P}\left( \frac{X_1 + \dotsb + X_N}{\sqrt{N}} \leq s \right) - \Phi(s) \right| \leq C(k)\,\frac{\mathbb{E}[|X|^{k+1}]}{N^{(k-1)/2}} + 3\,\exp\left(-c h w^3 \frac{N}{\mathbb{E}[|X|^{k+1}]}\right)$$

with universal $c > 0$ (Johnston, 2023). For symmetric laws with finite fourth moment, the rate improves to $O(1/N)$, sharply accelerating convergence compared to the classical regime. The density assumption is necessary: without it, lattice-type distributions (e.g., Bernoulli) saturate the $O(n^{-1/2})$ barrier.
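The contrast between the lattice and density regimes can be seen numerically. The following sketch compares two symmetric laws chosen here for illustration: fair Bernoulli summands (lattice) and uniform summands (bounded density, vanishing third moment). The uniform case uses the classical Irwin–Hall formula for the distribution of a sum of uniforms and approximates the supremum on a grid; it illustrates the rates only and is not the method of the cited paper.

```python
import math


def std_normal_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def irwin_hall_cdf(x: float, n: int) -> float:
    """Distribution function of a sum of n independent Uniform(0, 1) variables."""
    if x <= 0.0:
        return 0.0
    if x >= n:
        return 1.0
    total = 0.0
    for k in range(int(math.floor(x)) + 1):
        total += (-1) ** k * math.comb(n, k) * (x - k) ** n
    return total / math.factorial(n)


def uniform_distance(n: int, grid: int = 4000) -> float:
    """Grid approximation of sup_t |F_n(t) - Phi(t)| for normalized uniform sums."""
    scale = math.sqrt(n / 12.0)  # standard deviation of the unnormalized sum
    worst = 0.0
    for j in range(-grid, grid + 1):
        t = 4.0 * j / grid  # t ranges over [-4, 4]
        f = irwin_hall_cdf(n / 2.0 + t * scale, n)
        worst = max(worst, abs(f - std_normal_cdf(t)))
    return worst


def fair_bernoulli_distance(n: int) -> float:
    """Exact Kolmogorov distance for a normalized sum of n fair Bernoulli variables."""
    mean, sd = n / 2.0, math.sqrt(n) / 2.0
    cdf, worst = 0.0, 0.0
    for k in range(n + 1):
        phi = std_normal_cdf((k - mean) / sd)
        worst = max(worst, abs(cdf - phi))  # left limit at the atom
        cdf += math.comb(n, k) / 2.0**n
        worst = max(worst, abs(cdf - phi))  # value at the atom
    return worst


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(f"n={n:2d}  uniform={uniform_distance(n):.5f}  "
              f"bernoulli={fair_bernoulli_distance(n):.5f}")
```

Doubling $n$ should roughly halve the uniform distance (consistent with $O(1/N)$) while shrinking the Bernoulli distance only by a factor near $\sqrt{2}$ (consistent with $O(n^{-1/2})$).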

5. Dependent Data, U-Statistics, and Generalizations

The Berry–Esseen theorem has robust extensions to a wide range of dependent structures and complex statistics:

  • For locally dependent sequences and dependency graphs, the Kolmogorov error for sample quantiles is $O((\log n) / \sqrt{n})$ with explicit constants depending on local neighborhood sizes (Dey et al., 2022).
  • Uniform bounds for M-estimators of geometrically ergodic Markov chains, under regularity and moment-dominance conditions, are established at $O(n^{-1/2})$, uniformly over parameter families (Hervé et al., 2012).
  • Generalized U-statistics and subgraph count statistics can admit Berry–Esseen rates of $O(n^{-1})$ in regimes of combinatorial cancellation or strong connectivity, with the exchangeable-pair technique providing systematic control (Zhang, 2021).

6. Berry–Esseen in Stronger Metrics and Functional Settings

Recent advances provide quantitative Berry–Esseen rates in metrics stronger than Kolmogorov:

  • Sharp $O(n^{-1/2})$ bounds hold in total variation distance for normalized sums of absolutely continuous independent variables with finite third absolute moment and finite relative entropy to Gaussian, and the rate can be improved to $O(n^{-1})$ in relative entropy under a fourth-moment requirement (Bobkov et al., 2011).
  • In the uniform norm on local limit densities, for independent random vectors in $\mathbb{R}^d$ with bounded density, Lyapunov ratio, and maximal summand density $M$, one obtains

$$\sup_{x} |p_n(x) - \varphi(x)| \leq C M^2 \sigma B_3,$$

with precise dependencies on third-moment and density maxima (Bobkov et al., 2024).

  • For typical weighted sums with weak correlation, Kolmogorov distances decaying as $(\log n)/\sqrt{n}$ or $1/\sqrt{n}$ are available under suitable moment and small-ball constraints without the necessity of independence (Bobkov et al., 2017).

7. Specialized Models and Optimality: Random Matrix and Functional Cases

The Berry–Esseen theorem has been established for sophisticated models:

  • For the circular $\beta$-ensemble and related random matrix models, the Kolmogorov distance for arc counting functions is bounded by $O((\log N)^{-1/2})$, matching the scale of the logarithmic variance of linear statistics (Feng et al., 2019).
  • In the case of complex Wiener–Itô multiple integrals, optimal Berry–Esseen rates are available in the Wasserstein metric in terms of cumulant and contraction norms, extending the Fourth Moment Theorem and enabling sharp rates for statistics of the complex Ornstein–Uhlenbeck process (Chen et al., 2024).

8. Open Problems and Current Directions

Despite the extensive literature, several core questions remain open:

  • Determination of the optimal constants in various high- and infinite-dimensional Berry–Esseen inequalities remains a focus (Vershynin, 5 Feb 2026, Raič, 2018).
  • Sharp thresholds for the transition between $n^{-1/2}$ and $n^{-1}$ convergence rates under minimal density smoothness conditions have been partially established and remain under investigation (Johnston, 2023, Bobkov et al., 2024).
  • Extensions to functional CLTs, multivariate convergence in non-Euclidean metrics, and sharp rates in dependent or non-classical probabilistic structures are active areas of research (Leppänen, 2024, Hervé et al., 2012).
  • The role of entropy and information-theoretic distances as quantitative proxies for CLT convergence beyond total variation and Kolmogorov metrics is still being developed (Bobkov et al., 2011).

Summary Table: Core Berry–Esseen Bounds and Regimes

| Setting | Bound | Metric | Rate | Paper |
|---|---|---|---|---|
| Classical i.i.d., 3rd moment | $C \frac{\mathbb{E}\lvert X\rvert^3}{\sigma^3 \sqrt{n}}$ | Kolmogorov | $O(n^{-1/2})$ | (Vershynin, 5 Feb 2026) |
| Multivariate, convex sets | $C_d \sum \mathbb{E}\lVert X_i\rVert^3$ | Convex Kolmogorov | $O(d^{1/4} n^{-1/2})$ | (Raič, 2018) |
| 4th moment match, local density | $C M^2\, \mathbb{E}\lvert X\rvert^3/\sqrt{n}$ | Uniform density | $O(n^{-1/2})$ | (Bobkov et al., 2024) |
| 3-moment match, minimal density | $O(n^{-1})$ plus an exponentially small term | Kolmogorov | $O(n^{-1})$ | (Johnston, 2023) |
| Dependent (Markov, U-statistics, etc.) | $C n^{-1/2}$ | Kolmogorov | $O(n^{-1/2})$ | (arXiv:1205.2947, arXiv:2104.03479) |
| Entropic (TV, KL) | $O(n^{-1/2})$ or $O(n^{-1})$ | TV, KL, $W_2$ | $O(n^{-1/2})$, $O(n^{-1})$ | (Bobkov et al., 2011) |
| Circular $\beta$-ensemble | $C_\beta (\log N)^{-1/2}$ | Kolmogorov | $O((\log N)^{-1/2})$ | (Feng et al., 2019) |

This compilation reflects the rigorous progression from the classical Berry–Esseen theorem for sums of independent random variables, through higher-dimensional, dependent, and structural generalizations, to advanced probabilistic, functional, and information-theoretic regimes. Each variant leverages precise smoothing and coupling methodologies, yielding explicit and, in the best cases, optimal constants. The theorem's role in probability theory remains central due to its quantitative precision and the wide scope of its generalizations.
