
Neural reproducing kernel Banach spaces and representer theorems for deep networks (2403.08750v1)

Published 13 Mar 2024 in stat.ML, cs.LG, and math.FA

Abstract: Studying the function spaces defined by neural networks helps to understand the corresponding learning models and their inductive bias. While in some limits neural networks correspond to function spaces that are reproducing kernel Hilbert spaces, these regimes do not capture the properties of the networks used in practice. In contrast, in this paper we show that deep neural networks define suitable reproducing kernel Banach spaces. These spaces are equipped with norms that enforce a form of sparsity, enabling them to adapt to potential latent structures within the input data and their representations. In particular, leveraging the theory of reproducing kernel Banach spaces, combined with variational results, we derive representer theorems that justify the finite architectures commonly employed in applications. Our study extends analogous results for shallow networks and can be seen as a step towards considering more practically plausible neural architectures.
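As a rough illustration of what a representer theorem of this kind asserts (a generic sketch for the shallow, single-hidden-layer case that the paper extends, not the deep-network statement proved in it): minimizing a data-fit term plus a sparsity-enforcing norm over the whole function space admits a minimizer that is a finite-width network,

\[
\min_{f\in\mathcal{B}}\ \sum_{i=1}^{n}\ell\bigl(f(x_i),y_i\bigr)+\lambda\,\|f\|_{\mathcal{B}}
\qquad\Longrightarrow\qquad
f^{\star}(x)=\sum_{j=1}^{m}c_j\,\sigma(\langle w_j,x\rangle+b_j)+p(x),\quad m\le n,
\]

where $\mathcal{B}$ denotes the Banach space of network functions with its sparsity-enforcing norm, $\sigma$ is the activation, and $p$ is a low-degree polynomial term. The point is that an optimal solution needs at most as many neurons as there are data points, which is the sense in which such results "justify the finite architectures commonly employed in applications."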
