Degenerate U-Statistics: Limits & Deviations

Updated 29 October 2025

Degenerate U-statistic-type processes are defined by symmetric, degenerate kernels whose first-order projections vanish, emphasizing higher-order contributions.
They exhibit precise self-normalized moderate deviations and a law of the iterated logarithm that adapts to heavy-tailed distributions under minimal moment conditions.
These results enhance statistical inference in high-dimensional settings and network analysis by isolating dominant eigen-components and ensuring robust, adaptive testing.

Degenerate U-statistic-type processes are probability-theoretic and statistical objects arising when considering statistics of the form

$U_n = \frac{1}{n(n-1)} \sum_{1 \leq i \neq j \leq n} h(X_i, X_j)$

where the kernel function $h$ is symmetric and degenerate, meaning that all first-order projections vanish, i.e., $\mathbb{E}[h(X_1, y)] = 0$ for all $y$. Such processes play a fundamental role in nonparametric statistics, random graph theory, high-dimensional testing, stochastic geometry, and statistical learning, and their limit and deviation properties differ markedly from the non-degenerate case governed by the ordinary CLT. Current research addresses their moderate deviation probabilities, almost sure growth (laws of the iterated logarithm), and their control in heavy-tailed regimes, with particular interest in "self-normalized" versions that give sharp results under minimal moment assumptions.
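As a minimal sketch of the definition above, the following computes $U_n$ for an arbitrary symmetric kernel by evaluating it on all ordered pairs and discarding the diagonal. The function name `u_statistic` and the choice of the product kernel $h(x,y) = xy$ (degenerate whenever $\mathbb{E}[X_1] = 0$) are illustrative assumptions, not taken from the source.

```python
import numpy as np

def u_statistic(x, kernel):
    """Order-two U-statistic: average of kernel(x_i, x_j) over ordered pairs i != j."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if n < 2:
        raise ValueError("need at least two observations")
    # Evaluate the kernel on the full n x n grid via broadcasting.
    h = kernel(x[:, None], x[None, :])
    # Drop the diagonal terms i == j before averaging.
    off_diag_sum = h.sum() - np.trace(h)
    return off_diag_sum / (n * (n - 1))

def product_kernel(a, b):
    # Illustrative kernel h(x, y) = x * y; E[h(X_1, y)] = y * E[X_1] = 0 for centered data.
    return a * b

sample = [1.0, -1.0, 2.0]
u_value = u_statistic(sample, product_kernel)
```

For the product kernel the off-diagonal sum equals $(\sum_i x_i)^2 - \sum_i x_i^2$, which gives a quick hand check on small inputs.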

1. Canonical Structure and Degeneracy Conditions

A degenerate U-statistic of order two is defined by a kernel of the form $h(x, y) = \sum_{l=1}^\infty \lambda_l g_l(x) g_l(y)$, where $\lambda_l > 0$, $\sum_{l=1}^\infty \lambda_l < \infty$, and $\mathbb{E}[g_l(X_1)] = 0$ for all $l$. Each $g_l(X_1)$ lies in the domain of attraction of a normal law, i.e.,

$L_l(x) := \mathbb{E}[g_l^2(X_1) \, 1_{\{|g_l(X_1)| \leq x\}}]$

is slowly varying as $x \to \infty$. The degeneracy here ensures that the "linear" or non-degenerate part of the Hoeffding decomposition is absent, forcing higher-order structure to dominate the limiting distributions and deviation probabilities.
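To make the slow-variation condition concrete, consider the hypothetical case where $g_l(X_1)$ follows a Student $t$ distribution with two degrees of freedom, which has infinite variance but lies in the domain of attraction of a normal law. Its density is $(2 + t^2)^{-3/2}$, so the truncated second moment has the closed form used below. The sketch checks numerically that $L(2x)/L(x)$ drifts toward 1. The distribution and the function name are assumptions chosen for illustration.

```python
import math

def truncated_second_moment_t2(x):
    """L(x) = E[T^2 1{|T| <= x}] for T ~ Student t with 2 degrees of freedom.

    Uses the antiderivative of t^2 / (2 + t^2)^(3/2), namely
    asinh(t / sqrt(2)) - t / sqrt(2 + t^2), doubled by symmetry.
    """
    return 2.0 * (math.asinh(x / math.sqrt(2.0)) - x / math.sqrt(2.0 + x * x))

# Ratios L(2x) / L(x) at increasingly large truncation levels.
levels = [1e2, 1e4, 1e6, 1e8]
ratios = [truncated_second_moment_t2(2 * x) / truncated_second_moment_t2(x) for x in levels]
```

Here $L(x)$ grows like $2 \log x$, so it is unbounded (the variance is infinite) yet slowly varying, and the ratios approach 1 only at a logarithmic rate.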

Such kernels admit an orthogonal (Karhunen-Loève) expansion in $L^2(F \times F)$, where $F$ is the common marginal distribution of the i.i.d. observations $X_i$. The variance structure and large deviation behavior of $U_n$ are then naturally determined by the dominant eigenfunctions and associated quadratic forms. Key technical assumptions supplement this with conditions on the weights and on cross-covariances: that the weights are summable,

$\sum_{l=1}^\infty \lambda_l < \infty$

and further that for all $l \neq k$, the normalized cross-covariances

$\lim_{n \to \infty} \frac{\mathbb{E}[g_l(X_1) 1_{\{|g_l(X_1)| \leq z_{n,l}\}} g_k(X_1) 1_{\{|g_k(X_1)| \leq z_{n,k}\}}]}{\sqrt{L_l(z_{n,l}) L_k(z_{n,k})}} > 0$

(for suitable truncation levels $z_{n,l}$) remain strictly positive, which in turn guarantees a non-degenerate limiting covariance structure under minimal moment conditions.

2. Self-Normalized Moderate Deviations

The principal result on self-normalized moderate deviations states that for sequences $x_n \to \infty$ with $x_n = o(\sqrt{n})$,

$\log \mathbb{P} \left( \frac{\sum_{1 \leq i \neq j \leq n} h(X_i, X_j)}{\max_{l} \lambda_l V^2_{n,l}} \geq x_n^2 \right) \sim -\frac{x_n^2}{2}$

where

$V^2_{n,l} := \sum_{i=1}^n g_l^2(X_i).$

This quantifies the probability of large self-normalized fluctuations of the degenerate U-statistic and is a direct analogue of classical Cramér-type moderate deviations for normalized sums, although the dependence structure is different. The self-normalization is essential: dividing by the random variance proxy $\max_l \lambda_l V^2_{n,l}$ adapts to heavy-tailed or infinite variances and yields sharp exponential decay even without finite third moments or finite variances.
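For a kernel with finitely many terms, the self-normalized ratio can be computed without a double loop, since $\sum_{i \neq j} g_l(X_i) g_l(X_j) = S_{n,l}^2 - V_{n,l}^2$ with $S_{n,l} := \sum_i g_l(X_i)$. The following is a minimal sketch under that finite-rank assumption. The function name, the matrix layout `G[i, l] = g_l(X_i)`, and the example values are hypothetical.

```python
import numpy as np

def self_normalized_stat(G, lam):
    """Return sum_{i != j} h(X_i, X_j) / max_l(lam_l * V_{n,l}^2).

    G   : (n, L) array with G[i, l] = g_l(X_i)
    lam : (L,) array of positive weights lambda_l
    Assumes a finite-rank kernel h(x, y) = sum_l lam_l g_l(x) g_l(y).
    """
    G = np.asarray(G, dtype=float)
    lam = np.asarray(lam, dtype=float)
    S = G.sum(axis=0)            # S_{n,l} = sum_i g_l(X_i)
    V2 = (G ** 2).sum(axis=0)    # V_{n,l}^2 = sum_i g_l(X_i)^2
    numerator = np.sum(lam * (S ** 2 - V2))
    denominator = np.max(lam * V2)
    if denominator == 0:
        raise ValueError("all components are identically zero")
    return numerator / denominator

# Small worked example with two components.
G_demo = np.array([[1.0, 0.5], [-2.0, 1.0], [0.5, -1.5], [1.5, 2.0]])
lam_demo = np.array([0.7, 0.3])
w_demo = self_normalized_stat(G_demo, lam_demo)

# Brute-force double sum over i != j for comparison.
H = (G_demo * lam_demo) @ G_demo.T
brute = (H.sum() - np.trace(H)) / np.max(lam_demo * (G_demo ** 2).sum(axis=0))
```

The closed form costs $O(nL)$ operations instead of $O(n^2 L)$, which matters when the ratio is recomputed many times in simulation or resampling.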

Technical Steps

By truncating the variables and exploiting the degeneracy of the kernel, the analysis decomposes the sum into orthogonal components, with concentration dominated by the largest variance term.
Exponential inequalities and decoupling techniques are applied to control the maximal deviation for each eigen-component under minimal truncation assumptions.
Crucially, the behavior is captured by the maximum (over $l$) of the quadratic forms $\lambda_l V^2_{n,l}$, identifying the "dominant subspace" responsible for large deviations (a phenomenon not present in linear statistics).

This result fills a notable gap: previous moderate deviation theorems for self-normalized statistics, such as those for sums or non-degenerate U-statistics, required substantially stronger moment or boundedness conditions and did not generalize to the highly dependent form of degenerate U-terms.

3. Law of the Iterated Logarithm for Self-Normalized Degenerate U-Statistics

The law of the iterated logarithm (LIL) is established for the same self-normalized process:

$\limsup_{n \to \infty} \frac{\sum_{1 \leq i \neq j \leq n} h(X_i, X_j)}{\max_{l} \lambda_l V^2_{n,l} \cdot \log \log n} = 2 \quad \text{a.s.}$

This gives an almost sure upper envelope for the process: asymptotically, the numerator grows at most like $2 \log\log n$ times the dominant quadratic variance term $\max_l \lambda_l V^2_{n,l}$.

This result generalizes the classical LIL for normalized sums to degenerate U-statistics under heavy tails, and the constant 2 is the same one that appears in the classical case once the normalized sum is squared.
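The constant can be checked against the product kernel $h(x,y) = xy$. The reduction below is a worked illustration added here, using a single component with $\lambda_1 = 1$, $g_1(x) = x$, and the notation $S_n := \sum_i X_i$, $V_n^2 := \sum_i X_i^2$:

```latex
\[
\sum_{1 \leq i \neq j \leq n} X_i X_j = S_n^2 - V_n^2,
\qquad
\frac{\sum_{i \neq j} X_i X_j}{V_n^2 \log\log n}
  = \frac{S_n^2}{V_n^2 \log\log n} - \frac{1}{\log\log n}.
\]
```

The last term vanishes as $n \to \infty$, so the limsup equals 2 exactly when $\limsup_n |S_n| / (V_n \sqrt{2 \log\log n}) = 1$ a.s., which is the usual form of the self-normalized LIL for sums.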

4. Minimal Moment Assumptions and Heavy-Tailed Adaptivity

The self-normalized approach makes the analysis robust to heavy tails: it requires only that each $g_l(X_1)$ be in the domain of attraction of a normal law, which does not imply finite variance. This substantially weakens traditional moment hypotheses, since no finite third or even second moment is needed. The argument relies on truncation and on the slow variation of the truncated second moments $L_l$.

As a result:

Cases such as $h(x, y) = xy$ (i.e., the Davis momentless LIL for sums) are recovered,
More generally, for highly non-linear or quadratic statistics, the same self-normalized large deviation regime is accessible, even if the individual variables are far from sub-Gaussian,
The variance proxy $\max_l \lambda_l V^2_{n,l}$ adapts automatically to the heaviest-tailed or most-variant eigenspace.

This extends universality to degenerate U-statistics and provides theoretical justification for practice in heavy-tailed empirical settings.
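A small Monte Carlo experiment can illustrate this adaptivity for the product kernel $h(x,y) = xy$, where the self-normalized ratio reduces to $S_n^2 / V_n^2 - 1$. The sketch below draws Student $t$ samples with two degrees of freedom (infinite variance, chosen here as an assumption) and compares the empirical frequency of $\{W_n \geq x^2\}$ with the Gaussian benchmark $\mathbb{P}(Z^2 - 1 \geq x^2)$. Sample size, replication count, seed, and threshold are arbitrary illustrative choices, and finite-sample agreement is only approximate.

```python
import math
import numpy as np

def tail_frequency_t2(n=500, reps=2000, x=2.0, seed=0):
    """Empirical P(W_n >= x^2) for W_n = S_n^2 / V_n^2 - 1 with t_2 data."""
    rng = np.random.default_rng(seed)
    data = rng.standard_t(df=2, size=(reps, n))   # heavy tails: infinite variance
    S = data.sum(axis=1)
    V2 = (data ** 2).sum(axis=1)
    W = S ** 2 / V2 - 1.0                         # self-normalized ratio per replication
    return float(np.mean(W >= x ** 2))

def gaussian_benchmark(x=2.0):
    """P(Z^2 - 1 >= x^2) for standard normal Z, i.e. P(|Z| >= sqrt(x^2 + 1))."""
    return math.erfc(math.sqrt((x ** 2 + 1.0) / 2.0))

freq = tail_frequency_t2()
benchmark = gaussian_benchmark()
```

Replacing the denominator $V_n^2$ by $n$ times a fixed variance is not possible for this distribution, since no finite variance exists. That is the practical reason for normalizing by the data-driven proxy.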

5. Implications for Dependence Structure and Applications

These advances directly impact theory and practice in high-dimensional and network settings:

In high-dimensional or random graph statistics, degenerate U-statistics naturally arise (e.g., counts of subgraph configurations, motif moments), and their limiting behavior governs signal detection and testing thresholds in both parametric and nonparametric inference.
Self-normalization guarantees valid inference for degenerate, quadratic, or even more highly structured U-statistics under minimal tail assumptions, providing tools for random graph property testing, resampling, and inference in machine learning algorithms based on pairwise similarity or kernel methods.
The identification of the dominant eigenspace ($\max_{l} \lambda_l V^2_{n,l}$) in moderate deviations offers insight into which structural aspect of the data or kernel is responsible for extreme events, and facilitates the design of robust statistical tests and adaptive inference procedures.
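The dominant component can be read off directly from the per-component proxies $\lambda_l V^2_{n,l}$. The following hypothetical helper (names and example data are assumptions) returns the maximizing index together with all proxies, and shows that a component with a smaller weight can still dominate when its observations are larger in magnitude.

```python
import numpy as np

def dominant_component(G, lam):
    """Return (argmax_l lam_l * V_{n,l}^2, array of all proxies)."""
    G = np.asarray(G, dtype=float)
    lam = np.asarray(lam, dtype=float)
    proxies = lam * (G ** 2).sum(axis=0)   # lam_l * sum_i g_l(X_i)^2
    return int(np.argmax(proxies)), proxies

# The second component (index 1) has the smaller weight but one very large observation.
G_demo = np.array([[1.0, 0.1], [-1.0, 10.0], [0.5, -0.2]])
lam_demo = np.array([0.8, 0.2])
idx, proxies = dominant_component(G_demo, lam_demo)
```

In this example the second component dominates despite its smaller weight, because a single large observation inflates its quadratic form.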

Summary Table: Key Self-Normalized Results

Here $W_n := \sum_{1 \leq i \neq j \leq n} h(X_i, X_j) / \max_l \lambda_l V^2_{n,l}$ denotes the self-normalized statistic of Sections 2 and 3.

Property	Statement	Condition
Moderate deviation	$\log \mathbb{P}( W_n \ge x_n^2 ) \sim -x_n^2/2$	$x_n \to \infty$, $x_n = o(\sqrt{n})$
Law of the iterated logarithm	$\limsup_{n \to \infty} W_n / \log\log n = 2$ a.s.	I.i.d. $X_i$, each $g_l(X_1)$ in the domain of attraction of a normal law
Kernel assumptions	$h(x,y) = \sum_l \lambda_l g_l(x) g_l(y)$, $\sum_l \lambda_l < \infty$, minimal moments	See Section 1
Universality	Same form as for self-normalized sums	Degenerate U-statistics, domain of attraction

6. Broader Context and Technical Innovations

Self-normalized large deviations for degenerate U-statistics extend principles from linear statistics to the non-linear, dependent regime of degenerate U-statistics. They deliver the same sharp moderate deviation rate and LIL constant as for sums, under the minimal restrictions that self-normalization permits. The proof architecture exploits truncation, decoupling, and conditional variance extraction, techniques that handle both dependence and heavy-tailed components.

This framework is expected to have primary relevance in:

High-dimensional statistics and nonparametric testing, where degenerate U-statistics form the core of modern procedures,
Network data analysis, where motif-based statistics are typically degenerate and may be sensitive to heavy-tailed behavior,
Adaptive inference (resampling, bootstrapping) in situations involving degenerate or quadratic forms in observed data.

The results furnish asymptotically sharp quantifications of risk and maximal fluctuation in degenerate U-statistic processes, enabling both theoretical progress and practical robustness in modern statistical methodologies.
