Knowledge Transfer across Multiple Principal Component Analysis Studies (2403.07431v1)
Abstract: Transfer learning has attracted considerable interest in the statistical community. In this article, we focus on knowledge transfer for unsupervised learning tasks, in contrast to the supervised tasks studied in the existing literature. Given a set of transferable source populations, we propose a two-step transfer learning algorithm that extracts useful information from multiple source principal component analysis (PCA) studies to enhance estimation accuracy for the target PCA task. In the first step, we integrate the shared subspace information across studies through a method we call the Grassmannian barycenter, rather than performing PCA directly on the pooled dataset; the Grassmannian barycenter enjoys robustness and computational advantages in more general settings. In the second step, the resulting estimator of the shared subspace is used to estimate the target private subspace. Our theoretical analysis attributes the gain from knowledge transfer between PCA studies to an enlarged eigenvalue gap, in contrast to existing supervised transfer learning tasks where sparsity plays the central role. We further prove that, after knowledge transfer, bilinear forms of the empirical spectral projectors are asymptotically normal under weaker eigenvalue gap conditions. When the set of informative sources is unknown, we endow our algorithm with the ability to select useful datasets by solving a rectified optimization problem on the Grassmann manifold, which in turn leads to a computationally friendly rectified Grassmannian K-means procedure. Finally, extensive simulation results and a real-data case study on activity recognition support our theoretical claims and illustrate the empirical usefulness of the proposed transfer learning methods.
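To make the two-step procedure concrete, here is a minimal Python sketch under stated assumptions: it realizes the Grassmannian barycenter as the extrinsic mean of the source studies' spectral projectors (average the projectors U_k U_k^T and re-extract the leading eigenvectors), and it estimates the target private subspace from the part of the target sample covariance orthogonal to the shared subspace. The function names, the extrinsic construction, and the complement-projection step are illustrative assumptions, not the paper's exact estimator.

```python
import numpy as np

def grassmannian_barycenter(bases):
    """Extrinsic barycenter on the Grassmannian (an assumed construction):
    average the spectral projectors U_k U_k^T across studies and return
    the leading eigenvectors of the average.

    `bases` is a list of (p, r) matrices with orthonormal columns."""
    r = bases[0].shape[1]
    P_bar = sum(U @ U.T for U in bases) / len(bases)
    _, V = np.linalg.eigh(P_bar)      # eigenvalues in ascending order
    return V[:, -r:]                  # top-r eigenvectors span the barycenter

def two_step_transfer_pca(source_data, target_data, r_shared, r_private):
    """Step 1: pool shared-subspace information across source studies.
    Step 2: estimate the target private subspace from the component of
    the target covariance orthogonal to the shared subspace."""
    bases = []
    for X in source_data:                        # each X is (n_k, p)
        S_k = np.cov(X, rowvar=False)
        _, V = np.linalg.eigh(S_k)
        bases.append(V[:, -r_shared:])           # top eigenvectors of source k
    U_shared = grassmannian_barycenter(bases)

    S_t = np.cov(target_data, rowvar=False)
    p = S_t.shape[0]
    Q = np.eye(p) - U_shared @ U_shared.T        # projector onto complement
    _, V = np.linalg.eigh(Q @ S_t @ Q)
    U_private = V[:, -r_private:]
    return U_shared, U_private

# Usage on synthetic data: three source studies and one small target study.
rng = np.random.default_rng(0)
p = 20
sources = [rng.standard_normal((500, p)) for _ in range(3)]
target = rng.standard_normal((100, p))
U_s, U_p = two_step_transfer_pca(sources, target, r_shared=2, r_private=1)
print(U_s.shape, U_p.shape)  # (20, 2) (20, 1)
```

Averaging projectors rather than eigenvector bases sidesteps the sign and rotation ambiguity of individual eigenvectors, which is why subspace-valued quantities are naturally pooled on the Grassmann manifold rather than in Euclidean coordinates.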