
Ties, Tails and Spectra: On Rank-Based Dependency Measures in High Dimensions (2508.14992v1)

Published 20 Aug 2025 in math.ST, math.PR, and stat.TH

Abstract: This work is concerned with the limiting spectral distribution of rank-based dependency measures in high dimensions. We provide distribution-free results for multivariate empirical versions of Kendall's $\tau$ and Spearman's $\rho$ in a setting where the dimension $p$ grows at most proportionally to the sample size $n$. Although rank-based measures are known to be well suited for discrete and heavy-tailed data, previous works in the field focused mostly on the continuous and light-tailed case. We close this gap by imposing mild assumptions and allowing for general types of distributions. Interestingly, our analysis reveals that a non-trivial adjustment of classical Kendall's $\tau$ is needed to obtain a pivotal limiting distribution in the presence of tied data. The proof for Spearman's $\rho$ is facilitated by a result regarding the limiting eigenvalue distribution of a general class of random matrices with rows on the Euclidean unit sphere, which is of independent interest. For instance, this finding can be used to derive the limiting spectral distribution of sample correlation matrices, which, in contrast to most existing works, accommodates heavy-tailed data.


Summary

  • The paper derives limiting spectral distributions for rank-based measures (Kendall's τ and Spearman's ρ) under varying high-dimensional asymptotics.
  • It characterizes eigenvalue convergence to the semicircle law when $p/n \rightarrow 0$ and to the Marčenko-Pastur law when $p/n \rightarrow \gamma > 0$, where $p$ is the dimension and $n$ the sample size.
  • The modified approaches address issues with ties and heavy tails, enhancing robustness in non-parametric statistical analysis.

Ties, Tails, and Spectra: On Rank-Based Dependency Measures in High Dimensions

Introduction

The paper "Ties, Tails and Spectra: On Rank-Based Dependency Measures in High Dimensions" (2508.14992) investigates the spectral properties of rank-based dependency measures, focusing on Kendall's $\tau$ and Spearman's $\rho$ in high-dimensional settings. It addresses a gap in the existing literature by accommodating tied and heavy-tailed data, in contrast to prior work that dealt mainly with continuous, light-tailed distributions. The authors characterize the limiting eigenvalue distributions under these conditions.

Figure 1: Normalized histograms of the simulated eigenvalues of $\sqrt{n/p} \cdot$ (left panel) and $\sqrt{n/p} \cdot \operatorname{offdiag}(\boldsymbol{\tau})$ (right panel), showcasing limiting behaviors.

Eigenvalue Distributions of Dependency Measures

Preliminaries

The analysis assumes a setup where the dimension $p$ grows at most proportionally with the sample size $n$. The main goal is to identify the limiting spectral distributions (LSDs) of the rank-based measures as $p$ and $n$ tend to infinity. Two standard asymptotic frameworks are considered:

  1. $p/n \rightarrow 0$, leading to a semicircle distribution.
  2. $p/n \rightarrow \gamma > 0$, resulting in the Marčenko-Pastur distribution.

The paper thoroughly derives the LSDs for rank-based measures under these regimes, providing a unifying approach for both discrete and continuous data.
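For reference, the two limit laws named above have the following textbook densities. This is a sketch in the standard unit-variance normalization; the paper's statements hold up to its own centering and scaling of the matrices, so the constants here are an assumption rather than the paper's exact formulation.

```latex
% Semicircle law (unit variance), supported on [-2, 2]
f_{\mathrm{sc}}(x) = \frac{1}{2\pi}\sqrt{4 - x^{2}}, \qquad |x| \le 2.

% Marčenko-Pastur law with ratio \gamma \in (0, 1] and unit variance
f_{\mathrm{MP}}(x) = \frac{\sqrt{(b - x)(x - a)}}{2\pi \gamma x}, \qquad a \le x \le b,
\qquad a = (1 - \sqrt{\gamma})^{2}, \quad b = (1 + \sqrt{\gamma})^{2}.
```

For $\gamma > 1$ the Marčenko-Pastur law carries an additional point mass of weight $1 - 1/\gamma$ at zero.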

Spearman's $\rho$

Spearman's $\rho$ is written in matrix form, with row vectors that lie on the Euclidean unit sphere. The paper proves that the LSD follows the semicircle law when $p/n \rightarrow 0$ and the Marčenko-Pastur law when $p/n \rightarrow \gamma$, covering cases previously handled only for continuous data distributions.

Figure 2: Distribution shifts of Spearman's $\rho$ across various dimensional growth scenarios.
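The matrix form of Spearman's $\rho$ can be illustrated with a small simulation. The sketch below is not the paper's code: the helper names `midranks` and `spearman_matrix`, the sample sizes, and the rounded Cauchy data (chosen to produce both heavy tails and ties) are all assumptions made for illustration. It centers and normalizes each variable's rank vector to unit Euclidean norm, forms the Gram matrix, and prints the extreme eigenvalues next to the standard Marčenko-Pastur support edges for $\gamma = p/n$.

```python
import numpy as np

def midranks(x):
    """1-based average ranks; tied values share the mean of their positions."""
    x = np.asarray(x)
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    last = np.cumsum(counts)        # last sorted position of each distinct value
    first = last - counts + 1       # first sorted position of each distinct value
    return ((first + last) / 2.0)[inverse]

def spearman_matrix(X):
    """Spearman's rho matrix for an (n, p) array X (no column may be constant)."""
    R = np.column_stack([midranks(col) for col in X.T])
    R = R - R.mean(axis=0)              # center each variable's rank vector
    R = R / np.linalg.norm(R, axis=0)   # scale it to unit Euclidean norm
    return R.T @ R

rng = np.random.default_rng(0)
n, p = 400, 100
# Illustrative data: independent Cauchy entries, rounded to create ties.
X = np.round(rng.standard_cauchy(size=(n, p)))

rho = spearman_matrix(X)
eig = np.linalg.eigvalsh(rho)

gamma = p / n
lower, upper = (1 - np.sqrt(gamma)) ** 2, (1 + np.sqrt(gamma)) ** 2
print("smallest / largest eigenvalue:", eig.min(), eig.max())
print("Marchenko-Pastur support edges:", lower, upper)
```

Because every diagonal entry of the matrix equals 1, the eigenvalues always average to exactly 1; how closely the extremes track the support edges depends on $n$ and $p$.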

Kendall's $\tau$

For Kendall's $\tau$, the classical statistic is replaced by an adjusted version so that the LSD remains universal when the data contain ties. The results show that, up to scaling, the LSDs for the adjusted Kendall's $\tau$ under the same asymptotic conditions also follow the semicircle and Marčenko-Pastur laws.

Figure 3: Eigenvalue distributions of adjusted Kendall's $\tau$ demonstrating convergence in probability to theoretical limits.
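To see why ties matter for Kendall's $\tau$, it helps to compute the statistic on a tied sample. The sketch below does not reproduce the paper's adjustment, which is not spelled out in this summary. It contrasts the unadjusted statistic (often called $\tau_a$) with the well-known $\tau_b$ tie correction, used here purely as an illustration; the function name `kendall_tau` is hypothetical.

```python
from itertools import combinations

def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def kendall_tau(x, y):
    """Return (tau_a, tau_b) for two equal-length sequences."""
    n = len(x)
    s = 0          # concordant pairs minus discordant pairs
    nx = ny = 0    # number of pairs not tied in x / not tied in y
    for i, j in combinations(range(n), 2):
        sx, sy = sign(x[i] - x[j]), sign(y[i] - y[j])
        s += sx * sy
        nx += sx != 0
        ny += sy != 0
    n_pairs = n * (n - 1) // 2
    tau_a = s / n_pairs                      # no tie correction
    # tau_b rescales by the untied pair counts; undefined if a sequence is constant
    tau_b = s / (nx * ny) ** 0.5 if nx and ny else float("nan")
    return tau_a, tau_b

x = [1, 1, 2, 2, 3]                # two tied pairs out of ten
tau_a, tau_b = kendall_tau(x, x)
print(tau_a, tau_b)
```

On this sample, the unadjusted $\tau_a$ of a variable with itself falls below 1 because tied pairs contribute zero, while $\tau_b$ restores the value 1. In a matrix of pairwise statistics, this means the diagonal depends on the tie pattern of the data unless some adjustment is made.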

Statistical and Theoretical Implications

The results have notable implications for high-dimensional statistical analysis, particularly in fields such as finance, genomics, and network analysis, where non-parametric methods are increasingly prevalent due to robustness against outliers and ties.

Moreover, the theoretical findings establish a connection between random matrix theory and rank-based statistics, emphasizing the utility and limits of these dependency measures beyond traditional Gaussian assumptions. The universal nature of limiting distributions irrespective of underlying data distributions underscores this relationship.

Conclusion

The research extends the understanding of rank-based measures in high dimensions, particularly under non-standard data conditions. By providing universal results across different data types, this work enriches both theoretical insights and practical methodologies for modern data analysis, pointing towards future exploration of discrete and heavy-tailed impacts in statistical modeling and inference.

Figure 4: Histogram of diagonal entries of scaling matrix demonstrating robustness of the proposed scaling adjustments across sample sizes.
