On the Saturation Effect of Kernel Ridge Regression
Published 15 May 2024 in stat.ML and cs.LG | (2405.09362v2)
Abstract: The saturation effect refers to the phenomenon that kernel ridge regression (KRR) fails to achieve the information-theoretic lower bound when the smoothness of the underlying true function exceeds a certain level. The saturation effect has been widely observed in practice, and a saturation lower bound for KRR has been conjectured for decades. In this paper, we provide a proof of this long-standing conjecture.
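For readers unfamiliar with the estimator the abstract refers to, the following is a minimal sketch of KRR in its standard closed form, where the coefficients solve (K + nλI)α = y. It is not taken from the paper: the Gaussian kernel, the bandwidth, and the helper names `gaussian_kernel`, `solve_linear_system`, `krr_fit`, and `krr_predict` are all illustrative assumptions, and the code is written in plain Python for one-dimensional inputs only.

```python
import math


def gaussian_kernel(x, z, bandwidth=1.0):
    # Assumed kernel choice for illustration; the paper's setting is more general.
    return math.exp(-((x - z) ** 2) / (2.0 * bandwidth ** 2))


def solve_linear_system(a, b):
    # Gaussian elimination with partial pivoting on the augmented matrix [a | b].
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def krr_fit(xs, ys, lam, kernel=gaussian_kernel):
    # Solve (K + n * lam * I) alpha = y, the standard KRR normal equations.
    n = len(xs)
    gram = [[kernel(xi, xj) for xj in xs] for xi in xs]
    for i in range(n):
        gram[i][i] += n * lam
    return solve_linear_system(gram, ys)


def krr_predict(x_new, xs, alpha, kernel=gaussian_kernel):
    # The fitted function is a kernel expansion over the training points.
    return sum(a * kernel(x_new, xi) for a, xi in zip(alpha, xs))


if __name__ == "__main__":
    xs = [0.0, 0.5, 1.0, 1.5]
    ys = [math.sin(x) for x in xs]
    alpha = krr_fit(xs, ys, lam=1e-3)
    print(krr_predict(0.75, xs, alpha))
```

The regularization parameter `lam` is the only tuning knob here. The saturation effect described in the abstract concerns what can be achieved by any choice of this parameter once the target function is sufficiently smooth.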