Embedding Inequalities for Barron-type Spaces (2305.19082v3)
Abstract: An important problem in machine learning theory is to understand the approximation and generalization properties of two-layer neural networks in high dimensions. To this end, researchers have introduced the Barron space $\mathcal{B}_s(\Omega)$ and the spectral Barron space $\mathcal{F}_s(\Omega)$, where the index $s\in [0,\infty)$ indicates the smoothness of functions within these spaces and $\Omega\subset\mathbb{R}^d$ denotes the input domain. However, the precise relationship between the two types of Barron spaces remains unclear. In this paper, we establish a continuous embedding between them, as implied by the following inequality: for any $\delta\in (0,1)$, $s\in \mathbb{N}_{+}$, and $f: \Omega \to \mathbb{R}$, it holds that
\[
\delta\,\|f\|_{\mathcal{F}_{s-\delta}(\Omega)} \lesssim_s \|f\|_{\mathcal{B}_s(\Omega)} \lesssim_s \|f\|_{\mathcal{F}_{s+1}(\Omega)}.
\]
Importantly, the constants do not depend on the input dimension $d$, suggesting that the embedding is effective in high dimensions. Moreover, we show that both the lower and upper bounds are tight.
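For context, here is a minimal sketch of the standard definitions used in this literature; the exact norming (e.g. the choice of $\ell_1$ versus $\ell_2$ norms and the weight $(1+\|\xi\|_1)^s$) is an assumption and may differ from the paper. The spectral Barron norm measures Fourier decay,
\[
\|f\|_{\mathcal{F}_s(\Omega)} = \inf_{F|_\Omega = f} \int_{\mathbb{R}^d} \bigl(1+\|\xi\|_1\bigr)^{s}\, \bigl|\hat{F}(\xi)\bigr| \,\mathrm{d}\xi,
\]
where the infimum runs over extensions $F$ of $f$ to all of $\mathbb{R}^d$, while the Barron norm measures the cheapest infinite-width two-layer ReLU representation of $f$,
\[
\|f\|_{\mathcal{B}_s(\Omega)} = \inf_{\rho}\, \mathbb{E}_{(a,b,c)\sim\rho}\bigl[\,|a|\,(\|b\|_1+|c|)^{s}\,\bigr]
\quad \text{subject to} \quad
f(x) = \mathbb{E}_{(a,b,c)\sim\rho}\bigl[a\,\sigma(b^{\top}x + c)\bigr] \text{ on } \Omega,
\]
with $\sigma(t)=\max(t,0)$. Read against these definitions, the displayed inequality says that, up to dimension-free constants, membership in $\mathcal{F}_{s+1}(\Omega)$ suffices for membership in $\mathcal{B}_s(\Omega)$, which in turn forces membership in every $\mathcal{F}_{s-\delta}(\Omega)$ with $\delta\in(0,1)$.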