Spectral statistics of large dimensional Spearman's rank correlation matrix and its application (1312.5119v3)
Abstract: Let $\mathbf{Q}=(Q_1,\ldots,Q_n)$ be a random vector drawn from the uniform distribution on the set of all $n!$ permutations of $\{1,2,\ldots,n\}$. Let $\mathbf{Z}=(Z_1,\ldots,Z_n)$, where $Z_j$ is the mean zero variance one random variable obtained by centralizing and normalizing $Q_j$, $j=1,\ldots,n$. Assume that $\mathbf{X}_i$, $i=1,\ldots,p$ are i.i.d. copies of $\frac{1}{\sqrt{p}}\mathbf{Z}$ and $X=X_{p,n}$ is the $p\times n$ random matrix with $\mathbf{X}_i$ as its $i$th row. Then $S_n=XX^*$ is called the $p\times n$ Spearman's rank correlation matrix which can be regarded as a high dimensional extension of the classical nonparametric statistic Spearman's rank correlation coefficient between two independent random variables. In this paper, we establish a CLT for the linear spectral statistics of this nonparametric random matrix model in the scenario of high dimension, namely, $p=p(n)$ and $p/n\to c\in(0,\infty)$ as $n\to\infty$. We propose a novel evaluation scheme to estimate the core quantity in Anderson and Zeitouni's cumulant method in [Ann. Statist. 36 (2008) 2553-2576] to bypass the so-called joint cumulant summability. In addition, we raise a two-step comparison approach to obtain the explicit formulae for the mean and covariance functions in the CLT. Relying on this CLT, we then construct a distribution-free statistic to test complete independence for components of random vectors. Owing to the nonparametric property, we can use this test on generally distributed random variables including the heavy-tailed ones.
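The construction of $S_n$ described in the abstract can be illustrated numerically. The following is a minimal sketch, assuming NumPy; the function name `spearman_rank_matrix`, the sizes `p` and `n`, the seed, and the test function $f(x)=x^2$ are illustrative choices, not taken from the paper. It uses the fact that a uniform draw from $\{1,\ldots,n\}$ has mean $(n+1)/2$ and variance $(n^2-1)/12$, and it only computes an uncentered linear spectral statistic $\sum_i f(\lambda_i)$; the centering and variance from the paper's CLT are not reproduced here.

```python
import numpy as np

def spearman_rank_matrix(p, n, rng):
    """Build S_n = X X^T from p independent uniform random permutations of 1..n.

    Hypothetical helper for illustration; the name is not from the paper.
    """
    # Each row Q_i is an independent uniform random permutation of {1, ..., n}.
    Q = np.stack([rng.permutation(n) + 1 for _ in range(p)]).astype(float)
    # Centre and scale each entry: a uniform draw from {1..n} has
    # mean (n + 1) / 2 and variance (n^2 - 1) / 12.
    Z = (Q - (n + 1) / 2) / np.sqrt((n**2 - 1) / 12)
    # Rows of X are copies of Z / sqrt(p), as in the abstract.
    X = Z / np.sqrt(p)
    return X @ X.T

rng = np.random.default_rng(0)   # seed chosen arbitrarily
p, n = 50, 100                   # illustrative sizes with p / n = 0.5
S = spearman_rank_matrix(p, n, rng)

# Eigenvalues of the symmetric matrix S, in ascending order.
eigs = np.linalg.eigvalsh(S)

# An (uncentered) linear spectral statistic sum_i f(lambda_i), here with f(x) = x^2.
lss = np.sum(eigs**2)
print(S.shape, lss)
```

With this normalization every diagonal entry of $S_n$ equals $n/p$ exactly, because each row is a permutation of the same values, and the off-diagonal entry $(i,k)$ is $n/p$ times the classical Spearman coefficient between rows $i$ and $k$.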