Relaxing the eigengap assumption in the geometric analysis of PCA

Determine whether the asymptotic and non-asymptotic characterizations of PCA under the reconstruction loss can be extended to the degenerate case without an eigengap at rank k (i.e., when λk = λk+1), where the set of minimizers forms a Grassmannian submanifold and standard asymptotic M-estimation theory does not directly apply.

Background

The paper’s main results—central limit theorems for the PCA estimator and its excess risk, and matching non-asymptotic upper bounds—are derived under the eigengap condition λk > λk+1. This condition ensures that the population risk has an isolated minimizer, enabling classical asymptotic techniques for M-estimators on the Grassmannian.

When the eigengap vanishes (λk = λk+1), the set of minimizers is no longer a singleton but a submanifold of the Grassmannian (see the paper's characterization of the general minimizers), creating a degeneracy where standard asymptotic methods are not directly applicable. Extending the analysis to this setting would remove a key limitation and broaden the applicability of the theory.
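The degeneracy can be seen in a toy population-level example. The sketch below is illustrative and not taken from the paper: it assumes a mean-zero distribution with diagonal covariance, takes d = 3 and k = 2, and writes the reconstruction risk of a subspace with orthonormal basis U as tr(Σ) − tr(UᵀΣU). The helper names `reconstruction_risk` and `rotated_basis` are hypothetical.

```python
import math

def reconstruction_risk(eigvals, basis):
    """Risk tr(Sigma) - tr(U^T Sigma U) for Sigma = diag(eigvals).

    `basis` is a list of orthonormal vectors spanning the candidate subspace.
    Assumes mean-zero data, so Sigma is the covariance matrix.
    """
    total = sum(eigvals)
    captured = sum(lam * x * x for u in basis for lam, x in zip(eigvals, u))
    return total - captured

def rotated_basis(t):
    """Span of e1 and a unit vector rotated by angle t inside span(e2, e3)."""
    return [[1.0, 0.0, 0.0], [0.0, math.cos(t), math.sin(t)]]

angles = [i * math.pi / 8 for i in range(5)]  # 0, pi/8, ..., pi/2

# No eigengap at k = 2 (lambda_2 = lambda_3): the risk is constant in t,
# so every subspace in this one-parameter family is a minimizer.
risks_no_gap = [reconstruction_risk([3.0, 1.0, 1.0], rotated_basis(t)) for t in angles]

# Eigengap at k = 2 (lambda_2 > lambda_3): within this family the risk is
# smallest only at t = 0, so the minimizer is isolated.
risks_gap = [reconstruction_risk([3.0, 2.0, 1.0], rotated_basis(t)) for t in angles]

print(risks_no_gap)
print(risks_gap)
```

In the first case the risk is flat along the whole family, which is the kind of non-isolated minimizer set that the background above describes; in the second case the risk grows as the subspace rotates away from the top-2 eigenspace.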

References

The first is that they rely on the eigengap condition $\lambda_{k+1} - \lambda_{k} > 0$. While this is a mild assumption, it would be desirable to relax it, though this is quite challenging with our approach. To see why, note that without it, the minimizers of the reconstruction risk form a submanifold (itself a Grassmannian) of $\mathrm{Gr}(d, k)$ (see (\ref{eq:general_minimizers})). The classical theory of asymptotic statistics, upon which our results rely, does not immediately apply in such a degenerate setting, and we leave this problem to future work.

A Geometric Analysis of PCA (2510.20978 - Hanchi et al., 23 Oct 2025) in Section 6 (Discussion)