Esseen's Anti-Concentration Bound in Tensors
- Esseen's anti-concentration bound gives a quantitative, non-asymptotic estimate of the Kolmogorov distance between a normalized tensor sum and a Gaussian distribution.
- The framework leverages intrinsic parameters such as oscillation, partial-sum seminorms, and correlation measures to extend classical limit theorems to high-dimensional and degenerate settings.
- The approach unifies results from Esseen, Bolthausen, and Barbour–Chen by using exchangeable-pair coupling and Stein’s method to provide robust error estimates.
The anti-concentration bound of Esseen is a quantitative, non-asymptotic estimate for the Kolmogorov distance between a normalized statistic and the Gaussian; it measures the extent to which a linear combination of symmetric, exchangeable random variables (or tensor entries) avoids concentrating around any single value. The work of Dodos–Tyros (Dodos et al., 2022) establishes sharp anti-concentration (Berry–Esseen-type) bounds for sums of the form $\sum_{i} \theta_i X_i$, where $X$ is a random tensor with strong symmetry and $\theta$ is a deterministic tensor of coefficients, and provides explicit error terms in terms of intrinsic parameters. This framework recovers and extends classical results, such as Esseen’s and Bolthausen’s theorems, to arbitrary tensor order $d$, encompassing high-dimensional and degenerate regimes and elucidating the transition between classical independence and combinatorial dependence structures.
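As a way to make the Kolmogorov distance concrete, the following minimal sketch estimates it by Monte Carlo for the simplest possible case, a normalized sum of independent random signs. The function names (`kolmogorov_distance_to_gaussian`, `normalized_rademacher_sums`) and the choice of Rademacher entries are illustrative assumptions, not part of the Dodos–Tyros framework, and the estimate carries a sampling error of order one over the square root of the number of trials.

```python
import math
import numpy as np

def std_normal_cdf(x):
    """CDF of the standard Gaussian, evaluated elementwise."""
    x = np.asarray(x, dtype=float)
    return 0.5 * (1.0 + np.vectorize(math.erf)(x / math.sqrt(2.0)))

def kolmogorov_distance_to_gaussian(samples):
    """sup_x |F_emp(x) - Phi(x)| for the empirical CDF of `samples`."""
    s = np.sort(np.asarray(samples, dtype=float))
    m = s.size
    phi = std_normal_cdf(s)
    upper = np.arange(1, m + 1) / m - phi   # gap just after each jump
    lower = phi - np.arange(0, m) / m       # gap just before each jump
    return float(max(upper.max(), lower.max()))

def normalized_rademacher_sums(n, trials, rng):
    """Draw `trials` copies of (e_1 + ... + e_n) / sqrt(n), e_i = +-1."""
    signs = rng.choice([-1.0, 1.0], size=(trials, n))
    return signs.sum(axis=1) / math.sqrt(n)

# Illustrative run: the estimated distance should shrink as n grows.
rng = np.random.default_rng(0)
for n in (4, 16, 64):
    w = normalized_rademacher_sums(n, 20000, rng)
    print(n, kolmogorov_distance_to_gaussian(w))
```

The supremum is attained at a jump of the empirical distribution function, which is why it suffices to compare the Gaussian CDF with the empirical CDF just before and just after each sorted sample point.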
1. Framework and Statement of the Anti-Concentration Bound
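Let $d$ be the tensor order and $n$ the dimension; let $X = (X_i)_{i \in [n]^d}$ be a real-valued random tensor and $\theta = (\theta_i)_{i \in [n]^d}$ a deterministic tensor (vector of coefficients). The main object is the sum $\sum_{i \in [n]^d} \theta_i X_i$.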
Let and suppose a nondegeneracy condition on a parameter :
for some , with and , as defined below.
The anti-concentration (Kolmogorov) bound [Theorem 1.4; (Dodos et al., 2022)] is:
with , , given explicitly by:
2. Parameter Definitions
The error terms depend on several intrinsic parameters and seminorms:
- -Partial Sum Seminorm: For ,
Special cases: (total sum), (Euclidean norm).
- Exchangeability/Correlation Parameters : For ,
Notably, .
- Finite-population Hoeffding analogues : For ,
- Oscillation: The L deviation of coordinate block averages,
- Global Mean-Deviation:
- Explicit Constant:
3. Structural Hypotheses and Nondegeneracy
The bound requires the following properties for the random tensor $X$ and the coefficient tensor $\theta$:
- (A1) Moment and variance control: , ,
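- (A2) Symmetry, exchangeability, diagonal-free:
  - The random tensor $X$ is invariant under permutations of the coordinates of its index (symmetry)
  - The distribution of $X$ is invariant under any permutation of the index set $[n]$ (exchangeability)
  - $X_i = 0$ if the index $i$ has repeated coordinates (diagonal-free)
- (A3) Identical symmetry and diagonal-free assumptions for the coefficient tensor $\theta$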
- Nondegeneracy: the parameter in the nondegeneracy condition must be bounded below as specified above, so that the error terms do not involve division by near-zero denominators.
These structural conditions generalize beyond the i.i.d. setup to highly dependent, symmetric arrays where standard independence-based CLTs do not apply.
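The symmetry and diagonal-free hypotheses can be checked mechanically on a concrete array. The sketch below is a minimal illustration under the assumption that tensors are stored as NumPy arrays of shape `(n,) * d`; the helper names (`symmetrize`, `zero_diagonals`, `relabel`, and the two checkers) are hypothetical and not taken from the source. Exchangeability is a property of a distribution, so it cannot be verified on a single realization; `relabel` only shows one simple way to produce an exchangeable tensor, by applying a uniformly random permutation of the index set to a fixed tensor.

```python
import itertools
import numpy as np

def symmetrize(t):
    """Average a tensor of shape (n,)*d over all permutations of its axes."""
    perms = list(itertools.permutations(range(t.ndim)))
    return sum(np.transpose(t, p) for p in perms) / len(perms)

def zero_diagonals(t):
    """Set entries whose multi-index has a repeated coordinate to zero."""
    out = t.copy()
    for idx in itertools.product(range(t.shape[0]), repeat=t.ndim):
        if len(set(idx)) < t.ndim:
            out[idx] = 0.0
    return out

def relabel(t, pi):
    """Return the tensor with entries t[pi[i_1], ..., pi[i_d]]."""
    return t[np.ix_(*([pi] * t.ndim))]

def is_symmetric(t, tol=1e-12):
    """Check invariance under every permutation of the index coordinates."""
    return all(np.allclose(t, np.transpose(t, p), atol=tol)
               for p in itertools.permutations(range(t.ndim)))

def is_diagonal_free(t):
    """Check that entries with a repeated index coordinate vanish."""
    return all(t[idx] == 0.0
               for idx in itertools.product(range(t.shape[0]), repeat=t.ndim)
               if len(set(idx)) < t.ndim)

rng = np.random.default_rng(1)
n, d = 5, 3
theta = zero_diagonals(symmetrize(rng.normal(size=(n,) * d)))
x = relabel(theta, rng.permutation(n))  # one draw of an exchangeable tensor
print(is_symmetric(theta), is_diagonal_free(theta))
print(is_symmetric(x), is_diagonal_free(x))
```

Zeroing the diagonal after symmetrizing keeps the tensor symmetric, because the set of multi-indices with a repeated coordinate is itself invariant under coordinate permutations.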
4. Connections to Classical Esseen, Bolthausen, and Barbour–Chen Bounds
The Dodos–Tyros bound generalizes several pivotal prior results:
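- i.i.d. (Esseen/Berry–Esseen): for centred i.i.d. $\xi_1, \dots, \xi_n$ with variance $\sigma^2 > 0$ and $\rho = \mathbb{E}|\xi_1|^3$,
$$\sup_{x \in \mathbb{R}} \left| \mathbb{P}\left( \frac{\xi_1 + \dots + \xi_n}{\sigma \sqrt{n}} \leq x \right) - \Phi(x) \right| \leq \frac{C \rho}{\sigma^3 \sqrt{n}}$$
for independent entries with finite third moment, where $C$ is an absolute constant and $\Phi$ is the standard Gaussian distribution function.
- Bolthausen’s combinatorial CLT: For order-1 permutation statistics, e.g. sums $\sum_{i=1}^{n} a_{i,\pi(i)}$ with $\pi$ a uniformly random permutation and $(a_{ij})$ a matrix with zero row and column sums normalized so that the sum has unit variance, the optimal rate is $\frac{K}{n} \sum_{i,j} |a_{ij}|^3$ for an absolute constant $K$.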
- Barbour–Chen: For two-dimensional permutation U-statistics, the bound is
For , the Dodos–Tyros term matches the “linear” Berry–Esseen rates, while captures the degenerate variance–ratio perturbation, merging these regimes in a unified framework. For , the bound accommodates further tensor structure.
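For the order-1 permutation statistic appearing in Bolthausen’s setting, the mean and variance have classical closed forms due to Hoeffding, and these give the normalization under which a Kolmogorov bound is stated. The sketch below, with hypothetical helper names and a small random matrix chosen only for illustration, checks those closed forms against a brute-force enumeration of all permutations.

```python
import itertools
import numpy as np

def permutation_statistic(a, pi):
    """W = sum_i a[i, pi[i]] for a permutation pi of range(n)."""
    return float(a[np.arange(a.shape[0]), pi].sum())

def hoeffding_mean_variance(a):
    """Exact mean and variance of W when pi is uniform on all permutations."""
    n = a.shape[0]
    # Doubly centred matrix: subtract row means and column means, add grand mean.
    centred = (a - a.mean(axis=1, keepdims=True)
                 - a.mean(axis=0, keepdims=True) + a.mean())
    mean = n * a.mean()                        # (1/n) * sum of all entries
    var = (centred ** 2).sum() / (n - 1)
    return float(mean), float(var)

rng = np.random.default_rng(2)
n = 5
a = rng.normal(size=(n, n))

# Brute force over all n! permutations (feasible only for small n).
values = [permutation_statistic(a, np.array(p))
          for p in itertools.permutations(range(n))]
mean_formula, var_formula = hoeffding_mean_variance(a)
print(np.mean(values), mean_formula)
print(np.var(values), var_formula)
```

Dividing the centred statistic by the square root of this variance gives the normalized quantity whose distance to the Gaussian the combinatorial bounds control.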
5. Combinatorial Central Limit Theorem for High-Dimensional Tensors
The key methodological advance is a combinatorial CLT tailored to random tensors and permutation statistics [Theorem 2.2]:
Let be Hoeffding-type symmetric, zero-average kernels with and . The statistic
where , obeys
with (Bolthausen’s constant), .
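To make the shape of such a permutation statistic concrete, here is a minimal sketch that evaluates a sum of the form $\sum_{(i_1,\dots,i_d) \in [n]^d} \zeta(i_1,\dots,i_d,\pi(i_1),\dots,\pi(i_d))$ for a deterministic array $\zeta$ and a permutation $\pi$. The exact form of the statistic, the storage of $\zeta$ as a NumPy array with $2d$ axes, and the function name are assumptions made for illustration; the kernel here is random and is not required to satisfy the symmetry or zero-average hypotheses.

```python
import numpy as np

def tensor_permutation_statistic(zeta, pi):
    """Sum zeta[i_1..i_d, pi[i_1]..pi[i_d]] over all (i_1..i_d) in [n]^d.

    `zeta` has 2*d axes of equal length n; `pi` is a NumPy integer array
    holding a permutation of range(n).
    """
    d = zeta.ndim // 2
    n = zeta.shape[0]
    idx = np.indices((n,) * d)                  # idx[k] is the coordinate i_k
    rows = tuple(idx[k] for k in range(d))
    cols = tuple(pi[idx[k]] for k in range(d))  # coordinates pi(i_k)
    return float(zeta[rows + cols].sum())

rng = np.random.default_rng(3)
n, d = 4, 2
zeta = rng.normal(size=(n,) * (2 * d))
pi = rng.permutation(n)
print(tensor_permutation_statistic(zeta, pi))
```

For $d = 1$ the function reduces to the order-1 sum of $\zeta(i, \pi(i))$ over $i$, and for $d = 2$ it gives the two-dimensional permutation statistics of the Barbour–Chen type.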
The proof constructs an exchangeable-pair coupling via random transpositions and applies Stein’s method, with the coupling supplying the required linearity and variance control; it is supported by a generalized Hoeffding multi-index variance decomposition and direct moment bounds. The approach extends Barbour–Chen’s Stein concentration-inequality methods to higher-order tensors.
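The linearity condition behind an exchangeable-pair argument can be verified numerically in the simplest, order-1 case. The sketch below is an illustration of the classical mechanism only, not the paper’s construction for general tensor order: for a doubly centred matrix $a$ and $W = \sum_i a_{i,\pi(i)}$, composing $\pi$ with a uniformly random transposition gives $W'$ with $\mathbb{E}[W - W' \mid \pi] = \lambda W$ and $\lambda = 2/(n-1)$. Function names are hypothetical.

```python
import itertools
import numpy as np

def doubly_centre(a):
    """Subtract row and column means so all row and column sums vanish."""
    return (a - a.mean(axis=1, keepdims=True)
              - a.mean(axis=0, keepdims=True) + a.mean())

def statistic(a, pi):
    """W = sum_i a[i, pi[i]]."""
    return float(a[np.arange(len(pi)), pi].sum())

def mean_increment_over_transpositions(a, pi):
    """Exact E[W - W' | pi], averaging over all transpositions (i j)."""
    n = len(pi)
    w = statistic(a, pi)
    diffs = []
    for i, j in itertools.combinations(range(n), 2):
        pi2 = pi.copy()
        pi2[i], pi2[j] = pi[j], pi[i]   # swap the images of i and j
        diffs.append(w - statistic(a, pi2))
    return float(np.mean(diffs))

rng = np.random.default_rng(4)
n = 6
a = doubly_centre(rng.normal(size=(n, n)))
pi = rng.permutation(n)
w = statistic(a, pi)
lam = 2.0 / (n - 1)
print(mean_increment_over_transpositions(a, pi), lam * w)
```

The two printed numbers agree up to floating-point error for every permutation, which is the exact linear regression property that Stein’s method uses; the double centring is what makes the off-diagonal terms collapse to a multiple of $W$.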
6. Optimal Regimes and Theoretical Implications
Sharpness and optimality of the anti-concentration bound depend on intricate relationships between independence, degeneracy, and symmetry:
- Oscillation-dominated regime: If is large (weak dissociation), dominates and cannot be improved beyond .
- Nearly-linear regime: If are almost independent and all (for ), matches Bolthausen’s term.
- Partially degenerate regime: If degeneracy at some is present but small, yields an error smaller than , as in U-statistics of small effective rank.
- High-dimensional/mixed regime: For large , regimes interpolate smoothly between independence and fully degenerate (Hoeffding) structures.
For i.i.d. entries, the bound recovers the classical Berry–Esseen rates when is small, and Bolthausen’s rate when . Fully degenerate U-statistics () achieve even faster rates in small effective-rank settings.
7. Significance and Extensions
The anti-concentration bound of Esseen for random tensors unifies and extends normal approximation in highly symmetric, exchangeable, and high-dimensional settings. The explicit dependence on the oscillation, correlation, mean-deviation, and partial-sum seminorms provides computable error estimates that are optimal in several key regimes. This framework is instrumental in analyzing linear and nonlinear permutation statistics, and provides a rigorous foundation for statistical inference in combinatorial and high-order data-analytic scenarios (Dodos et al., 2022).