Generalization error theory for infinite-width nonlinear networks in the mean-field regime
Develop a rigorous theory that characterizes the generalization error of infinite-width nonlinear neural networks operating in the mean-field regime when trained on finite datasets, for example, two-layer ReLU networks with Gaussian inputs trained by gradient flow or gradient descent. Specifically, derive expressions for the expected test error as a function of sample size and the relevant model and task parameters, so that transferability in this nonlinear setting can be analyzed analytically rather than empirically.
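As a point of reference, a minimal formalization of the target quantity, written in common mean-field notation; the symbols \rho, f^*, and the squared-loss choice are assumptions for illustration, not fixed by the problem statement:

f(x; \rho) = \int a \, \sigma(\langle w, x \rangle) \, d\rho(a, w), \qquad \sigma(u) = \max(u, 0),

\mathcal{E}(n) = \mathbb{E}_{S \sim \mathcal{D}^n} \, \mathbb{E}_{x \sim \mathcal{N}(0, I_d)} \left[ \left( f(x; \rho_T(S)) - f^*(x) \right)^2 \right],

where \rho_T(S) denotes the parameter distribution after training by gradient flow (or gradient descent) on the empirical risk over the sample S, and f^* is the target function. The problem asks for an analytic characterization of \mathcal{E}(n) in terms of n, the input dimension, and task parameters.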
Deriving transferability also requires an expression for the generalization error of the scratch-trained model; however, we are not aware of a theory of generalization error for infinite-width nonlinear networks trained on finite data in the mean-field regime.
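For concreteness, the following is a minimal sketch of the empirical estimate that such a theory would replace, assuming a two-layer ReLU student with mean-field (1/m) output scaling, a planted two-layer ReLU teacher as the target, Gaussian inputs, and full-batch gradient descent; the widths, learning rate, and teacher construction are illustrative assumptions, not part of the original problem.

import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def scratch_test_error(n, d=10, m=2000, steps=3000, lr=0.5, n_test=5000, seed=0):
    """Empirical test error of a mean-field two-layer ReLU network trained from scratch on n samples."""
    rng = np.random.default_rng(seed)

    # Planted teacher network of width k (illustrative stand-in for the target f*).
    k = 5
    a_star = rng.choice([-1.0, 1.0], size=k)
    W_star = rng.standard_normal((k, d)) / np.sqrt(d)
    f_star = lambda X: relu(X @ W_star.T) @ a_star / k

    # Gaussian training and test inputs.
    X, X_test = rng.standard_normal((n, d)), rng.standard_normal((n_test, d))
    y, y_test = f_star(X), f_star(X_test)

    # Student with mean-field scaling: f(x) = (1/m) * sum_j a_j * relu(<w_j, x>).
    a = rng.choice([-1.0, 1.0], size=m).astype(float)
    W = rng.standard_normal((m, d))

    for _ in range(steps):
        pre = X @ W.T                     # (n, m) pre-activations
        act = relu(pre)                   # (n, m)
        resid = act @ a / m - y           # (n,) prediction residuals
        # Gradients of the empirical squared loss (1/2n) * sum_i resid_i^2.
        grad_a = act.T @ resid / (n * m)
        grad_W = ((resid[:, None] * (pre > 0)) * a[None, :]).T @ X / (n * m)
        # Under 1/m output scaling, an O(m) learning rate gives O(1) per-particle updates.
        a -= lr * m * grad_a
        W -= lr * m * grad_W

    test_pred = relu(X_test @ W.T) @ a / m
    return float(np.mean((test_pred - y_test) ** 2))

if __name__ == "__main__":
    # Empirical test error as a function of sample size n; the open problem asks for
    # an analytic expression for this curve (in expectation over the training sample).
    for n in (25, 50, 100, 200, 400):
        print(n, scratch_test_error(n))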