On the Approximation of Kernel functions (2403.06731v1)

Published 11 Mar 2024 in stat.ML and cs.LG

Abstract: Various methods in statistical learning build on kernels considered in reproducing kernel Hilbert spaces. In applications, the kernel is often selected based on characteristics of the problem and the data. This kernel is then employed to infer response variables at points where no explanatory data were observed. The data considered here are located in compact sets in higher dimensions, and the paper addresses approximations of the kernel itself. The new approach considers Taylor series approximations of radial kernel functions. For the Gauss kernel on the unit cube, the paper establishes an upper bound on the associated eigenfunctions, which grows only polynomially with respect to the index. The novel approach substantiates smaller regularization parameters than considered in the literature, overall leading to better approximations. This improvement confirms low-rank approximation methods such as the Nyström method.
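
The abstract's closing point concerns low-rank kernel approximation. Below is a minimal sketch of the standard Nyström approximation of a Gaussian kernel matrix on points in the unit cube; it is not the paper's Taylor-series construction, and the bandwidth sigma, the sample size n, and the number of landmark points m are illustrative choices only.

```python
# Sketch: rank-m Nyström approximation of a Gaussian kernel matrix (illustrative only).
import numpy as np

def gauss_kernel(X, Y, sigma=1.0):
    """Gaussian kernel matrix k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

rng = np.random.default_rng(0)
n, d, m = 500, 3, 50                         # n points in [0, 1]^d, m landmarks (assumed values)
X = rng.random((n, d))

idx = rng.choice(n, size=m, replace=False)   # landmarks sampled uniformly without replacement
K_nm = gauss_kernel(X, X[idx])               # n x m cross block
K_mm = gauss_kernel(X[idx], X[idx])          # m x m landmark block

# Rank-m Nyström approximation: K ≈ K_nm K_mm^+ K_nm^T
K_approx = K_nm @ np.linalg.pinv(K_mm) @ K_nm.T

K_full = gauss_kernel(X, X)
rel_err = np.linalg.norm(K_full - K_approx) / np.linalg.norm(K_full)
print(f"relative Frobenius error of rank-{m} Nystrom approximation: {rel_err:.2e}")
```

Because the Gaussian kernel's spectrum decays quickly on a compact set, a modest number of landmarks already yields a small relative error in this sketch; the paper's contribution is a sharper analysis of why such low-rank approximations work with smaller regularization parameters.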

Authors (2)
  1. Paul Dommel (5 papers)
  2. Alois Pichler (34 papers)

