On the Non-Asymptotic Concentration of Heteroskedastic Wishart-type Matrix (2008.12434v2)
Abstract: This paper focuses on the non-asymptotic concentration of heteroskedastic Wishart-type matrices. Suppose $Z$ is a $p_1$-by-$p_2$ random matrix with independent entries $Z_{ij} \sim N(0,\sigma_{ij}^2)$. We prove that the expected spectral norm of the Wishart matrix deviation (i.e., $\mathbb{E} \left\|ZZ^\top - \mathbb{E} ZZ^\top\right\|$) is upper bounded by \begin{equation*} \begin{split} (1+\epsilon)\left\{2\sigma_C\sigma_R + \sigma_C^2 + C\sigma_R\sigma_*\sqrt{\log(p_1 \wedge p_2)} + C\sigma_*^2\log(p_1 \wedge p_2)\right\}, \end{split} \end{equation*} where $\sigma_C^2 := \max_j \sum_{i=1}^{p_1}\sigma_{ij}^2$, $\sigma_R^2 := \max_i \sum_{j=1}^{p_2}\sigma_{ij}^2$, and $\sigma_*^2 := \max_{i,j}\sigma_{ij}^2$. A minimax lower bound is developed that matches this upper bound. Then, we derive concentration inequalities, moment bounds, and tail bounds for the heteroskedastic Wishart-type matrix under more general distributions, such as sub-Gaussian and heavy-tailed distributions. Next, we consider the cases where $Z$ has homoskedastic columns or rows (i.e., $\sigma_{ij} \approx \sigma_i$ or $\sigma_{ij} \approx \sigma_j$) and derive rate-optimal Wishart-type concentration bounds. Finally, we apply the developed tools to identify the sharp signal-to-noise ratio threshold for consistent clustering in the heteroskedastic clustering problem.
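To make the quantities in the bound concrete, the following is a minimal NumPy sketch, not code from the paper. It computes $\sigma_C$, $\sigma_R$, and $\sigma_*$ from a matrix of standard deviations and estimates $\mathbb{E}\|ZZ^\top - \mathbb{E} ZZ^\top\|$ by Monte Carlo. The function names (`variance_proxies`, `leading_term`, `mc_deviation`) are hypothetical. Because the abstract does not specify the constants $C$ and $\epsilon$, the sketch evaluates only the leading term $2\sigma_C\sigma_R + \sigma_C^2$. It relies on the fact that, for independent zero-mean entries, $\mathbb{E} ZZ^\top$ is diagonal with $i$-th entry $\sum_j \sigma_{ij}^2$.

```python
import numpy as np


def variance_proxies(sigma):
    """Return (sigma_C, sigma_R, sigma_star) for a p1-by-p2 array of standard deviations."""
    var = np.asarray(sigma, dtype=float) ** 2
    sigma_C = np.sqrt(var.sum(axis=0).max())  # max over columns j of sum_i sigma_ij^2
    sigma_R = np.sqrt(var.sum(axis=1).max())  # max over rows i of sum_j sigma_ij^2
    sigma_star = np.sqrt(var.max())           # largest entrywise standard deviation
    return sigma_C, sigma_R, sigma_star


def leading_term(sigma):
    """Leading part of the bound, 2*sigma_C*sigma_R + sigma_C^2 (log terms omitted)."""
    sigma_C, sigma_R, _ = variance_proxies(sigma)
    return 2.0 * sigma_C * sigma_R + sigma_C ** 2


def mc_deviation(sigma, n_trials=200, seed=0):
    """Monte Carlo estimate of E || Z Z^T - E Z Z^T || in spectral norm."""
    sigma = np.asarray(sigma, dtype=float)
    rng = np.random.default_rng(seed)
    # Independent zero-mean entries make E[Z Z^T] diagonal with the row sums of sigma^2.
    expected = np.diag((sigma ** 2).sum(axis=1))
    norms = []
    for _ in range(n_trials):
        Z = sigma * rng.standard_normal(sigma.shape)  # Z_ij ~ N(0, sigma_ij^2)
        norms.append(np.linalg.norm(Z @ Z.T - expected, ord=2))
    return float(np.mean(norms))


if __name__ == "__main__":
    # Illustrative heteroskedastic profile; the sizes and range are arbitrary choices.
    rng = np.random.default_rng(42)
    sigma = rng.uniform(0.5, 2.0, size=(30, 60))
    print("sigma_C, sigma_R, sigma_*:", variance_proxies(sigma))
    print("leading term:", leading_term(sigma))
    print("Monte Carlo estimate:", mc_deviation(sigma))
```

Comparing the Monte Carlo estimate with the leading term for different variance profiles gives a rough sense of how the row-wise and column-wise variance sums drive the deviation. It is not a verification of the theorem, since the logarithmic terms and the constants are left out.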