
Embedding Inequalities for Barron-type Spaces (2305.19082v3)

Published 30 May 2023 in stat.ML, cs.LG, cs.NA, and math.NA

Abstract: An important problem in machine learning theory is to understand the approximation and generalization properties of two-layer neural networks in high dimensions. To this end, researchers have introduced the Barron space $\mathcal{B}_s(\Omega)$ and the spectral Barron space $\mathcal{F}_s(\Omega)$, where the index $s \in [0,\infty)$ indicates the smoothness of functions within these spaces and $\Omega \subset \mathbb{R}^d$ denotes the input domain. However, the precise relationship between the two types of Barron spaces remains unclear. In this paper, we establish a continuous embedding between them as implied by the following inequality: for any $\delta \in (0,1)$, $s \in \mathbb{N}_+$, and $f\colon \Omega \mapsto \mathbb{R}$, it holds that
\[
\delta\,\|f\|_{\mathcal{F}_{s-\delta}(\Omega)} \;\lesssim_s\; \|f\|_{\mathcal{B}_s(\Omega)} \;\lesssim_s\; \|f\|_{\mathcal{F}_{s+1}(\Omega)}.
\]
Importantly, the constants do not depend on the input dimension $d$, suggesting that the embedding is effective in high dimensions. Moreover, we also show that the lower and upper bounds are both tight.
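For readers unfamiliar with these spaces, the two norms can be sketched as follows. This is a sketch following one common convention from the Barron-space literature (ReLU neurons, an $\ell_1$-type weighting of the frequency variable); the paper's exact normalization may differ. The spectral Barron norm is an infimum over all extensions $f_e$ of $f$ to $\mathbb{R}^d$,
\[
\|f\|_{\mathcal{F}_s(\Omega)} \;=\; \inf_{f_e|_\Omega = f} \int_{\mathbb{R}^d} \bigl(1 + \|\omega\|_1\bigr)^s \, |\hat{f}_e(\omega)| \,\mathrm{d}\omega ,
\]
while the Barron norm measures the cheapest way to represent $f$ as an infinite-width two-layer network,
\[
\|f\|_{\mathcal{B}_s(\Omega)} \;=\; \inf\Bigl\{\, \mathbb{E}_{(a,w,b)\sim\mu}\bigl[\,|a|\,(\|w\|_1 + |b|)^s\bigr] \;:\; f(x) = \mathbb{E}_{(a,w,b)\sim\mu}\bigl[a\,\sigma(w^{\top} x + b)\bigr] \ \text{for all } x \in \Omega \,\Bigr\},
\]
with $\sigma(t) = \max(t,0)$ and the infimum taken over probability measures $\mu$ on the parameter space. Read against these conventions, the embedding inequality says that, up to $s$-dependent but dimension-free constants, one extra order of Fourier smoothness suffices to place a function in the Barron space, while passing back from the Barron space to the spectral space loses at most an arbitrarily small amount $\delta$ of smoothness.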

