Spectral Algorithms on Manifolds through Diffusion (2403.03669v2)
Abstract: Existing research on spectral algorithms, applied within a Reproducing Kernel Hilbert Space (RKHS), has primarily focused on general kernel functions, often neglecting the inherent structure of the input feature space. Our paper introduces a new perspective, assuming that the input data lie on a low-dimensional manifold embedded in a higher-dimensional Euclidean space. We study the convergence performance of spectral algorithms in RKHSs generated by heat kernels, known as diffusion spaces. Incorporating the manifold structure of the input, we employ integral operator techniques to derive tight convergence upper bounds with respect to generalized norms, which indicate that the estimators converge to the target function in a strong sense, entailing the simultaneous convergence of the function itself and its derivatives. These bounds offer two significant advantages: first, they depend only on the intrinsic dimension of the input manifold, thereby providing a more focused analysis; second, they enable the efficient derivation of convergence rates for derivatives of any k-th order, all within the scope of the same spectral algorithms. Furthermore, we establish minimax lower bounds to demonstrate the asymptotic optimality of these conclusions in specific contexts. Our study confirms that spectral algorithms are practically significant in the broader context of high-dimensional approximation.
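To make the setting concrete, below is a minimal, illustrative sketch (not the paper's implementation) of the simplest spectral algorithm, kernel ridge regression with the Tikhonov filter g_lambda(sigma) = 1/(sigma + lambda), applied to noisy observations of a function on a one-dimensional manifold (a circle) embedded in R^3. The Gaussian kernel is used here as a Euclidean stand-in for the manifold heat kernel, which is typically not available in closed form; the toy data, bandwidth, and regularization parameter are all assumptions made for the example.

```python
# Illustrative sketch only: Tikhonov-regularized kernel regression on a toy
# 1-D manifold. The Gaussian kernel is a surrogate for the heat kernel.
import numpy as np

rng = np.random.default_rng(0)

# Sample n points from a circle embedded in R^3 (intrinsic dimension 1).
n = 200
theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
X = np.stack([np.cos(theta), np.sin(theta), np.zeros(n)], axis=1)

# Noisy observations of a smooth target function on the manifold.
f_star = lambda t: np.sin(3.0 * t)
y = f_star(theta) + 0.1 * rng.standard_normal(n)

def gaussian_kernel(A, B, t=0.5):
    """Gaussian kernel exp(-||a - b||^2 / (4t)), a heat-kernel surrogate."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (4.0 * t))

# Tikhonov regularization (kernel ridge regression): the spectral filter
# g_lambda(sigma) = 1 / (sigma + lambda) applied to the empirical operator K/n,
# i.e. alpha = (K/n + lambda I)^{-1} y / n.
lam = 1e-3
K = gaussian_kernel(X, X)
alpha = np.linalg.solve(K / n + lam * np.eye(n), y / n)

# Evaluate the estimator f_hat(x) = sum_i alpha_i k(x, x_i) on a test grid.
theta_test = np.linspace(0.0, 2.0 * np.pi, 100)
X_test = np.stack([np.cos(theta_test), np.sin(theta_test), np.zeros(100)], axis=1)
y_hat = gaussian_kernel(X_test, X) @ alpha

print("test RMSE:", np.sqrt(np.mean((y_hat - f_star(theta_test)) ** 2)))
```

Other spectral algorithms (gradient descent / early stopping, iterated Tikhonov, spectral cut-off) differ only in the choice of filter function applied to the eigenvalues of the empirical kernel operator.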