Limiting spectral distributions of large consistent rank correlation matrices

Published 29 Apr 2026 in math.PR and math.ST | (2604.26396v1)

Abstract: We study random matrices whose entries are obtained by applying consistent rank correlations, such as Hoeffding's $D$, pairwise to a high-dimensional random vector with mutually independent components. Prior work has shown that, in the proportional high-dimensional regime, the empirical spectral distributions of large Kendall's tau and Spearman's rho matrices converge weakly almost surely to the Marchenko--Pastur law. By contrast, we prove that for consistent rank correlations such as Hoeffding's $D$, the limiting spectral distribution is given by the semicircle law. Our result thus generalizes a recent work of Dong, Han, and Yao (2025), who considered the special case of Chatterjee's rank correlation and established the first semicircle law for a large correlation matrix in the proportional regime.

Authors (3)

Summary

  • The paper shows that the empirical spectral distribution of large consistent rank correlation matrices converges to a Wigner semicircle law rather than the Marchenko–Pastur law.
  • The methodology employs degenerate U-statistics and a Gram matrix representation to derive explicit radii for the semicircle law under proportional asymptotics.
  • Key implications include enhanced high-dimensional inference and robust nonparametric testing for independence using rank-based correlation measures.

Limiting Spectral Distributions for High-Dimensional Consistent Rank Correlation Matrices

Introduction and Motivation

This work addresses the limiting spectral distributions (LSDs) of large random matrices whose entries are consistent rank correlations, specifically nonparametric measures such as Hoeffding's $D$, Blum–Kiefer–Rosenblatt's $R$, and Bergsma–Dassios–Yanagimoto's $\tau^*$. The empirical spectral distributions (ESDs) of the extensively studied Pearson, Spearman, and Kendall correlation matrices are known to converge to the Marchenko–Pastur (MP) law in the high-dimensional proportional regime ($p/n \to \gamma \in (0, \infty)$). This paper shows that for consistent rank correlations the LSD is instead the Wigner semicircle law, a departure from the MP paradigm.

Problem Formulation and Main Results

Consider i.i.d. $p$-dimensional vectors $\mathbf{X}_1, \ldots, \mathbf{X}_n$ with mutually independent and continuous coordinates. The study focuses on the sample consistent rank correlation matrix $\widehat{\mathbf{R}}_n$, whose off-diagonal entries are given by bivariate rank-based U-statistics, and on its standardized version $\widehat{\mathbf{W}}_n = \sqrt{n}\,(\widehat{\mathbf{R}}_n - \mathbf{I}_p)$. The key finding, formalized for Hoeffding's $D$, Blum–Kiefer–Rosenblatt's $R$, and Bergsma–Dassios–Yanagimoto's $\tau^*$, is that the ESD converges weakly, almost surely, to a semicircle law with non-universal but explicit radii.
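The construction above can be sketched numerically for Hoeffding's $D$. This is a minimal illustration, not the paper's code: it assumes the classical sample formula for $D$ with the scaling factor 30, under which a variable paired with itself gives $D = 1$, so the diagonal of the matrix is 1. The paper's exact normalization may differ, and the names `hoeffding_d_matrix`, `R_hat`, and `W_hat` are hypothetical.

```python
import numpy as np

def hoeffding_d_matrix(X):
    """Pairwise Hoeffding's D for the columns of X (n x p, no ties, n >= 5).

    Assumption: classical sample formula scaled by 30, so that D(x, x) = 1.
    """
    n, p = X.shape
    # Marginal ranks 1..n within each column.
    R = X.argsort(axis=0).argsort(axis=0) + 1.0
    # less[j, a, b] = 1.0 if X[b, j] < X[a, j], else 0.0.
    less = (X.T[:, None, :] < X.T[:, :, None]).astype(float)
    # Q[j, k, a] = 1 + number of points strictly below point a
    # in both coordinate j and coordinate k (bivariate rank).
    Q = 1.0 + np.einsum("jab,kab->jka", less, less)
    Rj = R.T[:, None, :]  # shape (p, 1, n)
    Rk = R.T[None, :, :]  # shape (1, p, n)
    D1 = ((Q - 1) * (Q - 2)).sum(axis=2)
    D2 = ((Rj - 1) * (Rj - 2) * (Rk - 1) * (Rk - 2)).sum(axis=2)
    D3 = ((Rj - 2) * (Rk - 2) * (Q - 1)).sum(axis=2)
    num = (n - 2) * (n - 3) * D1 + D2 - 2 * (n - 2) * D3
    den = n * (n - 1) * (n - 2) * (n - 3) * (n - 4)
    return 30.0 * num / den

rng = np.random.default_rng(0)
n, p = 60, 30                        # small sizes, chosen only for speed
X = rng.standard_normal((n, p))      # independent continuous coordinates

R_hat = hoeffding_d_matrix(X)                # sample rank correlation matrix
W_hat = np.sqrt(n) * (R_hat - np.eye(p))     # standardized version
eigs = np.linalg.eigvalsh(W_hat)             # spectrum whose histogram is the ESD
```

The bivariate ranks are computed by brute force in $O(p^2 n^2)$ operations, which is adequate for a small demonstration but would need a faster algorithm at realistic sizes.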

Concretely, under proportional asymptotics, the ESD of each of the three standardized matrices converges to a semicircle law whose radius is explicit and depends on the statistic.

Figure 1: ESDs of the standardized matrices for Hoeffding's $D$, Blum–Kiefer–Rosenblatt's $R$, and Bergsma–Dassios–Yanagimoto's $\tau^*$, overlaid with the predicted semicircle limits.
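For reference, the semicircle law with radius $r$ has density $f(x) = \frac{2}{\pi r^2}\sqrt{r^2 - x^2}$ on $[-r, r]$, with mean 0 and second moment $r^2/4$. The sketch below evaluates this density and checks both facts numerically. The radius used is an arbitrary illustrative value, not one of the radii derived in the paper, and `semicircle_pdf` is a hypothetical helper name.

```python
import numpy as np

def semicircle_pdf(x, radius):
    """Density of the semicircle law supported on [-radius, radius]."""
    x = np.asarray(x, dtype=float)
    # Clip so that points outside the support get density 0 rather than NaN.
    inside = np.clip(radius**2 - x**2, 0.0, None)
    return 2.0 / (np.pi * radius**2) * np.sqrt(inside)

radius = 1.5                                   # illustrative value only
grid = np.linspace(-radius, radius, 20001)
dens = semicircle_pdf(grid, radius)
dx = grid[1] - grid[0]

# Riemann sums (the density vanishes at both endpoints).
total_mass = np.sum(dens) * dx                 # should be close to 1
second_moment = np.sum(grid**2 * dens) * dx    # should be close to radius**2 / 4
```

Overlaying `semicircle_pdf` on an eigenvalue histogram is how a plot in the style of Figure 1 would be produced once the radius is known.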

Structural Insights and Theoretical Analysis

Universality Beyond the MP Law

The result contrasts sharply with prior results for the LSDs of normalized Pearson, Spearman, and Kendall matrices, which are governed by the MP law in the same asymptotic regime. The paper gives the first general class of statistically meaningful high-dimensional random correlation matrices whose LSD is the semicircle law rather than the MP law.

A crucial distinction is the degeneracy of the kernels of these U-statistics under the independence null: the first-order term of the Hoeffding decomposition vanishes, so a second-order term determines the LSD. For non-degenerate U-statistics such as those underlying the Spearman and Kendall matrices, the leading term is analogous to a sample covariance structure, whose ESD falls within MP universality. In contrast, the class of consistent rank correlations considered here, by virtue of its degeneracy and symmetry, gives rise to a Gram structure analogous to Wigner-type matrices.
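The MP side of this dichotomy is easy to observe in simulation. The following sketch builds a Spearman's rho matrix from independent Gaussian columns and compares its spectrum with the MP law of ratio $\gamma = p/n$, whose support is $[(1-\sqrt{\gamma})^2, (1+\sqrt{\gamma})^2]$ and whose second moment is $1 + \gamma$. The sample sizes and the Gaussian data are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 400, 200                      # illustrative sizes, gamma = 0.5
X = rng.standard_normal((n, p))      # independent continuous coordinates

# Column-wise ranks (no ties for continuous data).
ranks = X.argsort(axis=0).argsort(axis=0).astype(float)

# Spearman's rho matrix = Pearson correlation matrix of the ranks.
S = np.corrcoef(ranks, rowvar=False)
eigs = np.linalg.eigvalsh(S)

gamma = p / n
mp_lower = (1 - np.sqrt(gamma)) ** 2     # left edge of the MP support
mp_upper = (1 + np.sqrt(gamma)) ** 2     # right edge of the MP support

first_moment = eigs.mean()               # equals 1, since trace(S) = p
second_moment = np.mean(eigs**2)         # MP prediction: 1 + gamma
```

The spectrum sits on the positive half-line and is skewed to the right, which is the shape that the standardized consistent rank correlation matrices do not share.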

Detailed Gram Matrix Representation

The analysis uses the Hoeffding (or von Mises) decomposition for degenerate U-statistics. The leading term can be expressed as a sum of quadratic forms, yielding an explicit high-dimensional Gram-type structure for the off-diagonal entries: the standardized matrix is approximated by a Gram matrix with its diagonal removed, where the underlying matrix has columns constructed from projections of the data onto an orthonormal basis induced by the eigenstructure of the degenerate kernel.

This reveals that, up to negligible reshaping and scaling, the matrix behaves like a Wigner-type ensemble, whose ESD is well known in ultra-high-dimensional random matrix theory (RMT) to converge to the semicircle law.
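A toy version of this mechanism can be simulated directly. The sketch below is not the paper's construction: it assumes a matrix `Y` with i.i.d. standard Gaussian entries and an inner dimension `m` much larger than the matrix size `p`, forms the Gram matrix, removes its diagonal, and rescales by $\sqrt{mp}$. Under these assumptions the spectrum is close to a semicircle of radius 2, whose second moment is 1.

```python
import numpy as np

rng = np.random.default_rng(1)
p, m = 150, 15000                    # m >> p: long inner dimension (assumption)
Y = rng.standard_normal((m, p))      # stand-in for the projected data matrix

G = Y.T @ Y                          # p x p Gram matrix
np.fill_diagonal(G, 0.0)             # diagonal removal
W = G / np.sqrt(m * p)               # off-diagonal entries have variance 1/p

eigs = np.linalg.eigvalsh(W)
second_moment = np.mean(eigs**2)     # semicircle of radius 2 predicts 1
spectral_radius = np.abs(eigs).max() # should be near 2 for large p and m/p
```

Each off-diagonal entry is an inner product of two independent columns, so the entries are centered and only weakly dependent, which is why the diagonal-removed Gram matrix resembles a Wigner matrix rather than a sample covariance matrix.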

Technical Methods

The proof proceeds via the Stieltjes transform method and detailed control of dependence for the random Gram-type matrix. Truncation and approximation techniques reduce the analysis to a finite-rank setting, after which classical RMT arguments adapted from the Gram matrix literature (e.g., Bai–Yin, Bai–Silverstein) yield the desired limiting spectral law. Extra attention is paid to the weak, structured dependencies induced by the U-statistic kernel's symmetry and rank-dependence properties.
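As background on the Stieltjes transform method, the transform $m(z) = \int \frac{d\mu(x)}{x - z}$ of the semicircle law with radius $r$ solves $\frac{r^2}{4} m^2 + z m + 1 = 0$, and convergence of the ESD is equivalent to pointwise convergence of the empirical transform for $z$ in the upper half-plane. The sketch below checks this on a plain Gaussian Wigner matrix, used here only as a stand-in because its limit is known; it is not the matrix studied in the paper, and `semicircle_stieltjes` is a hypothetical helper name.

```python
import numpy as np

def semicircle_stieltjes(z, radius):
    """Stieltjes transform of the semicircle law, for Im(z) > 0."""
    z = np.asarray(z, dtype=complex)
    # Writing the root as a product selects the branch that behaves like z
    # at infinity, which gives Im m(z) > 0 in the upper half-plane.
    root = np.sqrt(z - radius) * np.sqrt(z + radius)
    return 2.0 * (-z + root) / radius**2

rng = np.random.default_rng(2)
N = 400
A = rng.standard_normal((N, N))
H = (A + A.T) / np.sqrt(2 * N)       # off-diagonal variance 1/N, so radius 2
eigs = np.linalg.eigvalsh(H)

z = 0.5 + 0.5j
m_emp = np.mean(1.0 / (eigs - z))            # empirical Stieltjes transform
m_lim = semicircle_stieltjes(z, 2.0)         # limiting value
residual = (2.0**2 / 4) * m_lim**2 + z * m_lim + 1   # quadratic equation check
```

In a proof, the analogue of `m_emp` for the matrix of interest is shown to satisfy the quadratic equation approximately, which identifies the limit.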

Numerical Illustrations

The paper includes empirical histograms of the eigenvalue distributions of the standardized rank correlation matrices, which fit the predicted semicircle densities closely across several choices of the consistent U-statistic kernel. This agreement supports the accuracy of the proven limit at finite sample sizes.

The principal advance relative to [Dong et al., 2025] is the substantial generalization from Chatterjee's rank correlation to the entire class of consistent rank correlations, making the semicircle phenomenon robust to a broad class of dependence measures. Previous LSD results for U-statistic-based random matrices were confined to special cases, often failing to capture the phenomenon at this generality [MR4185806, MR3737306].

Moreover, the result highlights a clear dichotomy in spectral behavior: MP-type universality for non-degenerate rank-based or moment-based correlations, and semicircle-type universality for degenerate, symmetric, consistent rank correlations.

Implications and Prospective Directions

Theoretical Consequences

This work demonstrates that the spectral behavior of large random matrices constructed from valid, powerful independence statistics can dramatically depart from the covariance-dominated MP regime, suggesting new universality classes within high-dimensional RMT. The results indicate that degeneracy and symmetry in U-statistic kernels can induce semicircular limiting spectra. This insight opens avenues for examining other classes of nonparametric statistics with analogous structure and for further characterizing the landscape of possible high-dimensional spectral limits.

Practical Relevance

From a practical standpoint, understanding the LSD of large correlation matrices is fundamental for high-dimensional inference, particularly for hypothesis testing and the design of robust, nonparametric statistical procedures in large dimensions. The semicircle law implies that, for the rank-based correlations considered here, the null spectrum in high dimensions is symmetric about zero, unlike the right-skewed MP law, with consequences for signal detection and independence testing in large-scale applications.

Outlook for Further Research

Future work can extend to scenarios with dependent coordinates, non-continuous margins, or heavy-tailed settings, and can explore distributional limits outside the proportional regime. There is potential for new RMT universality classes arising from analogous constructions in multivariate dependence testing and kernel methods. Additionally, the spectral phenomena described have relevance for understanding the stability and eigenstructure of complex nonparametric estimators in modern machine learning and high-dimensional statistics.

Conclusion

This work establishes that for a wide class of consistent, symmetric, rank-based U-statistics, including Hoeffding's $D$, BKR's $R$, and BDY's $\tau^*$, the limiting spectral distribution of the resulting high-dimensional standardized correlation matrix is governed by the Wigner semicircle law. This stands in sharp contrast to the MP law characteristic of classical (non-degenerate) rank-based correlation matrices. The findings provide both new theoretical understanding of the RMT landscape for nonparametric dependence measures and practical insights for statistical inference in high dimensions (2604.26396).