Penalized Principal Component Analysis for Large-dimension Factor Model with Group Pursuit (2407.19378v3)
Abstract: This paper investigates the intrinsic group structures within the framework of large-dimensional approximate factor models, which portray homogeneous effects of the common factors on the individuals that fall into the same group. To this end, we propose a fusion Penalized Principal Component Analysis (PPCA) method and derive a closed-form solution for the $\ell_2$-norm optimization problem. We also establish the asymptotic properties of the proposed PPCA estimates. With the PPCA estimates as an initialization, we identify the unknown group structure by combining the agglomerative hierarchical clustering algorithm with an information criterion. The factor loadings and factor scores are then re-estimated conditional on the identified latent groups. Under some regularity conditions, we establish the consistency of the membership estimators as well as that of the group number estimator derived from the information criterion. Theoretically, we show that the post-clustering estimators for the factor loadings and factor scores with group pursuit achieve efficiency gains over the estimators from the conventional PCA method. Thorough numerical studies validate the established theory, and a real financial example illustrates the practical usefulness of the proposed method.
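
The three-stage pipeline described above (initial loading estimates, hierarchical clustering of the loadings, re-estimation given the groups) can be illustrated with a minimal sketch. It is not the paper's procedure: conventional PCA stands in for the PPCA initialization, the number of groups `K` is treated as known rather than selected by an information criterion, and the post-clustering step simply averages loadings within each group. The simulated data and all names (`centers`, `L_hat`, `L_group`, `F_post`) are assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)

# Simulated panel; all sizes and values are illustrative assumptions.
T, N, r, K = 200, 60, 2, 3
centers = np.array([[2.0, 0.0], [0.0, 2.0], [-2.0, -2.0]])  # one loading vector per group
true_groups = np.repeat(np.arange(K), N // K)               # group label of each individual
loadings = centers[true_groups]                             # N x r, identical within a group
factors = rng.standard_normal((T, r))                       # T x r
X = factors @ loadings.T + rng.standard_normal((T, N))      # T x N observations

# Stage 1 (stand-in for PPCA): conventional PCA estimates.
# eigh returns eigenvalues in ascending order, so the last r columns are the top r.
_, eigvecs = np.linalg.eigh(X @ X.T)
F_hat = np.sqrt(T) * eigvecs[:, -r:]   # factor scores, normalized so F_hat'F_hat / T = I
L_hat = X.T @ F_hat / T                # N x r loadings, identified up to a rotation

# Stage 2: agglomerative hierarchical clustering of the loading rows.
# Rows in the same group stay close after rotation, so clustering is still meaningful.
Z = linkage(L_hat, method="ward")
labels = fcluster(Z, t=K, criterion="maxclust")  # K assumed known in this sketch

# Stage 3: re-estimate given the groups.
# Loadings: within-group average; factors: least squares of X on the grouped loadings.
L_group = np.zeros_like(L_hat)
for g in np.unique(labels):
    L_group[labels == g] = L_hat[labels == g].mean(axis=0)
F_post = np.linalg.solve(L_group.T @ L_group, L_group.T @ X.T).T  # T x r
```

Averaging within groups pools information across individuals that share a loading vector, which is the intuition behind the efficiency gain claimed for the post-clustering estimators; the paper's actual estimators and its information criterion for choosing the number of groups are not reproduced here.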