Berry–Esseen Theorem for Non-i.i.d. Variables

Updated 9 February 2026

The Berry–Esseen theorem for non-i.i.d. variables extends the classical result, quantifying the error in normal approximation with explicit bounds for heterogeneous and dependent structures.
It employs methodologies such as Fourier analysis, Stein's method, coupling, and dependency graphs to derive optimal rates under various moment conditions and distance metrics.
The framework enhances error estimates for complex models including U-statistics, local dependence, and weakly dependent time series, aiding precise probabilistic approximations.

The Berry–Esseen theorem provides explicit quantitative rates for the central limit theorem by bounding the distance between the distribution function of a standardized sum of random variables and the standard normal distribution function. Its extension to non-i.i.d. random variables is a mature area, covering a diverse range of dependence structures, moment conditions, and metrics, including Kolmogorov, total variation, relative entropy, and chi-square distances. Principal methodologies include Fourier analysis, Stein's method, coupling and concentration, dependency graphs, and chaos/expansion approaches.

1. Classical Berry–Esseen Bound and the Non-i.i.d. Setting

For a sum $S_n = X_1 + \cdots + X_n$ of independent, not necessarily identically distributed random variables with $\mathbb{E} X_j = 0$, $\mathrm{Var}(X_j) = \sigma_j^2 > 0$, and $\mathbb{E}|X_j|^3 < \infty$, the Berry–Esseen theorem asserts

$\sup_{x \in \mathbb{R}} \left| \mathbb{P}\left( \frac{S_n}{(\sum \sigma_j^2)^{1/2}} \le x \right) - \Phi(x) \right| \le C \frac{\sum_j \mathbb{E}|X_j|^3}{(\sum_j \sigma_j^2)^{3/2}}$

where $\Phi$ is the standard normal cdf and $C$ is an absolute constant. Proven upper bounds on the optimal constant include $C \le 0.5600$ in the non-i.i.d. case and $C \le 0.4748$ in the i.i.d. case, while Esseen's lower bound shows that $C \ge 0.4097$ (Vershynin, 5 Feb 2026, Pinelis, 2013, Klartag et al., 2010).

This form is robust: the dependence on the Lyapunov ratio is known to be optimal in order. The proof via Fourier analysis (Esseen's smoothing inequality) extends directly to the non-i.i.d. case, exploiting independence for factorization of characteristic functions (Vershynin, 5 Feb 2026).
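As a numerical sanity check of the bound above, the following minimal sketch (not taken from the cited works) computes the Lyapunov ratio for centered Bernoulli summands $X_j = B_j - p_j$ with heterogeneous success probabilities, evaluates the exact Kolmogorov distance by convolving the point masses, and compares it with $C = 0.56$ times the ratio. The choice of Bernoulli summands, the probabilities, and all function names are illustrative assumptions.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function Phi."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def lyapunov_ratio(probs):
    """Sum of E|X_j|^3 over (sum of variances)^(3/2) for X_j = B_j - p_j."""
    variances = [p * (1 - p) for p in probs]
    # E|B - p|^3 = p(1-p)^3 + (1-p)p^3 = p(1-p)((1-p)^2 + p^2)
    third = [p * (1 - p) * ((1 - p) ** 2 + p ** 2) for p in probs]
    return sum(third) / sum(variances) ** 1.5


def bernoulli_sum_pmf(probs):
    """Exact pmf of B_1 + ... + B_n for independent Bernoulli(p_j)."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)
            nxt[k + 1] += mass * p
        pmf = nxt
    return pmf


def kolmogorov_distance(probs):
    """sup_x |F_n(x) - Phi(x)| for the standardized sum.

    F_n is a step function, so the supremum is attained at a jump,
    either at the left limit or at the value of F_n there.
    """
    mean = sum(probs)
    sd = math.sqrt(sum(p * (1 - p) for p in probs))
    dist, cdf = 0.0, 0.0
    for k, mass in enumerate(bernoulli_sum_pmf(probs)):
        phi = normal_cdf((k - mean) / sd)
        dist = max(dist, abs(cdf - phi))  # left limit at the jump
        cdf += mass
        dist = max(dist, abs(cdf - phi))  # value at the jump
    return dist


# Illustrative heterogeneous probabilities between 0.1 and 0.9.
probs = [0.1 + 0.8 * j / 39 for j in range(40)]
distance = kolmogorov_distance(probs)
bound = 0.56 * lyapunov_ratio(probs)
print(f"Kolmogorov distance: {distance:.4f}, Berry-Esseen bound: {bound:.4f}")
```

The exact distance stays below the bound, as the theorem guarantees; for lattice summands such as these the gap is typically a modest constant factor rather than an order of magnitude.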

2. Extensions Across Metrics

Classical extensions of the Berry–Esseen theorem for non-i.i.d. sequences include results in Kolmogorov, total variation, entropy, and $\chi^2$ metrics:

Kolmogorov Distance (Uniform and Non-uniform): For the distribution function $F_n$ of the standardized sum, the non-uniform Berry–Esseen bound refines the uniform one by adding polynomial decay in the tails:

$|F_n(x) - \Phi(x)| \le C_{nu} \frac{\sum_i \mathbb{E}|X_i|^3}{(\sum_i \sigma_i^2)^{3/2}} \frac{1}{1 + |x|^3}$

with the best proven general-case value of the constant $C_{nu}$ discussed in (Pinelis, 2013).

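Total Variation and Entropic Bounds: For independent but non-identically distributed random variables with uniformly bounded entropic distance to normality, $D(X_j) \le D$, Bobkov–Chistyakov–Götze (Bobkov et al., 2011) showed

$\| F_n - \Phi \|_{\mathrm{TV}} \le C_D \frac{\sum_j \mathbb{E}|X_j|^3}{(\sum_j \sigma_j^2)^{3/2}}$

where $F_n$ denotes the distribution of the standardized sum and $C_D$ depends only on $D$, that is, a bound in a stronger norm but with the same rate in the Lyapunov ratio.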

$\chi^2$ and Rényi Distances: If the $X_j$ have densities whose moments match those of the Gaussian up to a given order, then, in the polynomial-density case, the $\chi^2$ distance between the law of the standardized sum and the Gaussian decays at a rate that improves with the number of matched moments (Delplancke et al., 2017).

Chaos Expansions: For general functionals and U-statistics of independent random variables, including non-i.i.d. cases, chaos-expansion arguments yield explicit Kolmogorov bounds that recover the classical rates under standard moment hypotheses (Privault et al., 2020).

3. Dependency Graphs and General Dependent Structures

A significant direction concerns summands with weak dependencies encoded via a sparsity graph.

Dependency Graphs: Consider random variables $X_1, \dots, X_n$ with finite means and variances and finite moments of some order greater than two, together with a dependency graph of given maximal degree, such that any two disjoint index sets with no edges between them index independent subfamilies. For the standardized sum, the Kolmogorov distance to the standard normal law admits a bound with explicit constants (Janisch et al., 2022), improving over Stein-method bounds with much worse degree dependence.

The classical rate is recovered as a special case.
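To make the notion of a dependency graph concrete, here is a minimal sketch under a simplifying assumption: each $X_i$ is a function of a known set of independent inputs, and two indices are joined whenever their input sets overlap. Variables with disjoint input sets are independent, so this construction yields a valid dependency graph, though not necessarily the sparsest one. The function names and the moving-window example are hypothetical.

```python
def dependency_graph(sources):
    """Join i and j whenever X_i and X_j share an independent input.

    `sources[i]` is the set of labels of the independent inputs that
    X_i depends on (an assumption of this sketch).
    """
    n = len(sources)
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if sources[i] & sources[j]:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors


def max_degree(neighbors):
    """Maximal degree of the graph (0 for an empty graph)."""
    return max((len(adj) for adj in neighbors.values()), default=0)


# Moving-window example: X_i = f(eps_i, eps_{i+1}) with independent eps_k.
sources = [{i, i + 1} for i in range(6)]
graph = dependency_graph(sources)
print(max_degree(graph))
```

In the moving-window example each variable overlaps only with its two immediate neighbors, so the maximal degree stays bounded as the number of variables grows, which is the sparse regime such bounds are designed for.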

Local Dependence: Under a local dependency structure (e.g., dependency neighborhoods/blocks), the error decomposes into terms controlled explicitly by local indices that encode the sizes of the graph neighborhoods (Cai et al., 2 Feb 2026).

4. Weak Dependence, Markov Chains, and Dynamical Systems

Stationary Sequences and Time Series: For strictly stationary mean-zero processes with finite moments of order $p \in (2, 3]$, under a mild condition on a weak-dependence coefficient, the Kolmogorov distance between the normalized partial sum and the normal law is of the optimal order $n^{-(p/2 - 1)}$, thus interpolating between independence and more weakly dependent cases (Jirak, 2016).
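Under dependence, the partial sum has to be standardized by $\mathrm{Var}(S_n)$, which for a stationary sequence grows like $n$ times the long-run variance $\sum_k \gamma(k)$ rather than $n \, \mathrm{Var}(X_1)$. The sketch below illustrates this for a stationary AR(1) process $X_t = \phi X_{t-1} + \varepsilon_t$, chosen here purely as an illustrative assumption; the function names are hypothetical.

```python
def ar1_autocovariance(k, phi, sigma2=1.0):
    """gamma(k) for a stationary AR(1) with innovation variance sigma2."""
    return sigma2 * phi ** abs(k) / (1.0 - phi ** 2)


def variance_of_sum(n, phi, sigma2=1.0):
    """Exact Var(S_n) = n*gamma(0) + 2*sum_{k=1}^{n-1} (n-k)*gamma(k)."""
    total = n * ar1_autocovariance(0, phi, sigma2)
    for k in range(1, n):
        total += 2.0 * (n - k) * ar1_autocovariance(k, phi, sigma2)
    return total


def long_run_variance(phi, sigma2=1.0):
    """Sum of gamma(k) over all integers k, equal to sigma2 / (1 - phi)^2."""
    return sigma2 / (1.0 - phi) ** 2


n, phi = 2000, 0.5
per_step = variance_of_sum(n, phi) / n
print(per_step, long_run_variance(phi), ar1_autocovariance(0, phi))
```

For $\phi = 0.5$ the per-step variance of the sum approaches the long-run variance, which is three times the marginal variance; standardizing by the marginal variance alone would therefore give the wrong limit.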

Spectral Methods: For weakly dependent Markov/non-Markov sequences with controlled cumulants and a spectral gap, non-uniform Berry–Esseen bounds and Edgeworth expansions are available, together with analogues for transport metrics and higher-order approximations (Hafouta, 2022).

Polynomial Densities and Markovian Recursion: For independent, non-identically distributed $X_j$ whose densities admit a Hermite expansion and whose moments match those of the Gaussian up to a given order, and subject to polynomial-density growth conditions, Markovian spectral analysis gives a $\chi^2$-distance bound whose rate is optimal under these hypotheses (Delplancke et al., 2017).

5. Stein’s Method, Exchangeable Pairs, and Coupling

Stein's Method for Local and Global Dependence: Both size-bias coupling (Goldstein, 2010) and exchangeable pairs, even with unbounded jumps (Shao et al., 2017), allow explicit Kolmogorov and Wasserstein bounds for wide classes of models, including dependency graphs.

For exchangeable pairs $(W, W')$ satisfying the regression condition $\mathbb{E}(W - W' \mid W) = \lambda (W + R)$, the Kolmogorov distance between $W$ and the standard normal law is controlled by a sum of explicit terms involving the conditional moments of $W - W'$ given $W$ and the remainder $R$ (Shao et al., 2017). This approach is robust to dependent structures provided an appropriate pair and remainder can be constructed.
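A standard textbook construction of an exchangeable pair, used here only as an illustration and not taken from the cited paper, starts from $W = n^{-1/2} \sum_i X_i$ and obtains $W'$ by replacing one uniformly chosen coordinate with an independent copy. The sketch below checks by exhaustive enumeration, for independent signs $X_i \in \{-1, +1\}$, that $\mathbb{E}(W - W' \mid X) = W/n$ (so the regression condition holds with $\lambda = 1/n$ and $R = 0$) and that $\mathbb{E}((W - W')^2 \mid X) = 2\lambda$. The function name is hypothetical.

```python
import itertools
import math


def exchangeable_pair_gaps(n):
    """Worst-case deviations from the two identities over all sign vectors.

    Returns (regression_gap, variance_gap), where
      regression_gap = max |E[W - W' | X] - W/n|
      variance_gap   = max |E[(W - W')^2 | X] - 2/n|.
    """
    regression_gap = 0.0
    variance_gap = 0.0
    for x in itertools.product((-1, 1), repeat=n):
        w = sum(x) / math.sqrt(n)
        mean_diff = 0.0
        mean_sq = 0.0
        # Average over the resampled index (uniform on 0..n-1) and the
        # fresh sign (uniform on {-1, +1}); each pair has weight 1/(2n).
        for i in range(n):
            for fresh in (-1, 1):
                diff = (x[i] - fresh) / math.sqrt(n)  # equals W - W'
                mean_diff += diff / (2 * n)
                mean_sq += diff ** 2 / (2 * n)
        regression_gap = max(regression_gap, abs(mean_diff - w / n))
        variance_gap = max(variance_gap, abs(mean_sq - 2.0 / n))
    return regression_gap, variance_gap


print(exchangeable_pair_gaps(4))
```

Conditioning on the full vector $X$ is finer than conditioning on $W$, so both identities carry over to conditional expectations given $W$ by the tower property.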

6. Non-i.i.d. Rates for Functionals and U-Statistics

Hoeffding Decomposable U-Statistics: Berry–Esseen bounds extend to degenerate and non-degenerate U-statistics of independent non-identically distributed random variables, with bounds expressed through the first projection in the Hoeffding decomposition (Privault et al., 2020, Cai et al., 2 Feb 2026).
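For orientation, the Hoeffding decomposition can be written out in the simplest setting. The display below is a sketch for a degree-two U-statistic with symmetric kernel $h$ and, as a simplifying assumption not made in the cited works, i.i.d. observations; the symbols $h$, $\theta$, $g$, and $\psi$ are introduced here for illustration.

```latex
% Assumption: X_1, ..., X_n i.i.d., h symmetric with E|h(X_1, X_2)| finite.
\begin{align*}
U_n &= \binom{n}{2}^{-1} \sum_{1 \le i < j \le n} h(X_i, X_j),
  \qquad \theta = \mathbb{E}\, h(X_1, X_2), \\
g(x) &= \mathbb{E}\, h(x, X_2) - \theta
  \qquad \text{(first projection)}, \\
\psi(x, y) &= h(x, y) - \theta - g(x) - g(y)
  \qquad \text{(degenerate remainder)}, \\
U_n - \theta &= \frac{2}{n} \sum_{i=1}^{n} g(X_i)
  + \binom{n}{2}^{-1} \sum_{1 \le i < j \le n} \psi(X_i, X_j).
\end{align*}
```

When $\mathrm{Var}\, g(X_1) > 0$ (the non-degenerate case), the linear term is a sum of independent variables and dominates the fluctuations, which is why the first projection governs the normal approximation rate.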

Weighted Sums: For sums with non-uniform weights, analogous bounds in terms of the weights and the moments of the summands provide sharp error control for linear combinations (Privault et al., 2020, Klartag et al., 2010).
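The simplest bound of this type follows directly from the independent-case inequality of Section 1 applied to the summands $a_j X_j$. The derivation below is a sketch; the deterministic weights $a_j$ and the moment bound $\rho$ are symbols introduced here for illustration, and the results in the cited works may be sharper.

```latex
% Assumption: deterministic weights a_1, ..., a_n, not all zero.
\begin{align*}
\mathrm{Var}(a_j X_j) &= a_j^2 \sigma_j^2, \qquad
\mathbb{E}|a_j X_j|^3 = |a_j|^3 \, \mathbb{E}|X_j|^3, \\
\sup_{x \in \mathbb{R}} \left| \mathbb{P}\left(
  \frac{\sum_j a_j X_j}{(\sum_j a_j^2 \sigma_j^2)^{1/2}} \le x \right)
  - \Phi(x) \right|
&\le C \, \frac{\sum_j |a_j|^3 \, \mathbb{E}|X_j|^3}
             {(\sum_j a_j^2 \sigma_j^2)^{3/2}}. \\
\intertext{If in addition $\sigma_j = 1$, $\mathbb{E}|X_j|^3 \le \rho$, and $\sum_j a_j^2 = 1$, then}
C \, \frac{\sum_j |a_j|^3 \, \mathbb{E}|X_j|^3}
          {(\sum_j a_j^2 \sigma_j^2)^{3/2}}
&\le C \rho \sum_j |a_j|^3
 \le C \rho \max_j |a_j|.
\end{align*}
```

The last line shows that the approximation is good as soon as no single weight dominates; equal weights $a_j = n^{-1/2}$ recover the classical $n^{-1/2}$ rate.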

7. Non-uniform and High-dimensional Berry–Esseen Bounds

Non-uniform Bounds in Tails: For the non-i.i.d. case, the best-known results for the distribution function $F_n$ of the standardized sum take the form

$|F_n(x) - \Phi(x)| \le C_{nu} \frac{L_n}{1 + |x|^3}$

with $L_n = \sum_j \mathbb{E}|X_j|^3 / (\sum_j \sigma_j^2)^{3/2}$ the Lyapunov ratio. Pinelis (Pinelis, 2013) and refinements (Pinelis, 2011) showed exponential decay in tails by combining Stein’s equation with Chen–Shao concentration, with bounds that depend on free parameters and can therefore be optimized for small or large deviations.

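Multivariate Extensions: For independent, non-identically distributed mean-zero random vectors $X_1, \dots, X_n$ in $\mathbb{R}^d$, if $\sum_i \mathrm{Var}(X_i) = I_d$, the error for measurable convex sets $A$ satisfies

$\left| \mathbb{P}\left( \sum_i X_i \in A \right) - \mathcal{N}(0, I_d)(A) \right| \le C d^{1/4} \sum_i \mathbb{E}|X_i|^3$

where $|\cdot|$ is the Euclidean norm and the constant is given explicitly in (Raič, 2018).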

References

(Janisch et al., 2022) Berry-Esseen-type estimates for random variables with a sparse dependency graph
(Cai et al., 2 Feb 2026) Refined Berry-Esseen bounds under local dependence
(Privault et al., 2020) Berry-Esseen bounds for functionals of independent random variables
(Jirak, 2016) Berry-Esseen theorems under weak dependence
(Pinelis, 2013) On the nonuniform Berry–Esseen bound
(Bobkov et al., 2011) Berry-Esseen bounds in the entropic central limit theorem
(Pinelis, 2011) Improved nonuniform Berry–Esseen-type bounds
(Vershynin, 5 Feb 2026) A friendly proof of the Berry-Esseen theorem
(Klartag et al., 2010) Variations on the Berry-Esseen theorem
(Goldstein, 2010) A Berry-Esseen bound with applications to vertex degree counts in the Erdős-Rényi random graph
(Raič, 2018) A multivariate Berry–Esseen theorem with explicit constants
(Hafouta, 2022) Non-uniform Berry-Esseen theorem and Edgeworth expansions with applications to transport distances for weakly dependent random variables
(Delplancke et al., 2017) Berry-Esseen bounds for the chi-square distance in the Central Limit Theorem: a Markovian approach
(Shao et al., 2017) Berry-Esseen Bounds of Normal and Non-normal Approximation for Unbounded Exchangeable Pairs
(Dey et al., 2022) Berry-Esseen Theorem for Sample Quantiles with Locally Dependent Data

The modern theory of the Berry–Esseen theorem for non-i.i.d. random variables encompasses a hierarchy of models, ranging from independence with heterogeneous distributions, through local and global dependence (dependency graphs, block structures) and weak or mixing dependence (Markov chains, dynamical systems), to non-linear functionals (U-statistics, vertex counts, quantiles). Across these settings, sharp rates are controlled by moment conditions and explicit dependency or coupling indices, and in many cases the results attain optimal constants or tail-adaptive rates for both uniform and non-uniform distances to normality.