Optimization landscape of two-layer neural networks (population risk)
Characterize the optimization landscape of the population risk R_N(θ) for two-layer neural networks with prediction ŷ(x;θ) = (1/N) ∑_{i=1}^N σ_*(x; θ_i) and square loss ℓ(y,ŷ) = (y − ŷ)^2. This includes the existence and structure of local minima, saddle points, and global minima. The question remains largely open even in the infinite-data setting, where the population risk itself, rather than an empirical estimate of it, is available.
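To make the objects in the problem statement concrete, here is a minimal sketch that approximates the population risk R_N(θ) = E[(y − ŷ(x;θ))^2] by Monte Carlo sampling. The statement leaves σ_* and the data distribution unspecified, so the choices below are assumptions made only for illustration: σ_*(x; θ_i) = tanh(⟨θ_i, x⟩), Gaussian inputs, and labels produced by a single hypothetical "teacher" unit. The function names are hypothetical as well.

```python
import math
import random


def sigma_star(x, theta_i):
    """Assumed unit for illustration: tanh of the inner product <theta_i, x>."""
    return math.tanh(sum(t * xi for t, xi in zip(theta_i, x)))


def predict(x, theta):
    """y_hat(x; theta) = (1/N) * sum_i sigma_star(x; theta_i)."""
    return sum(sigma_star(x, th) for th in theta) / len(theta)


def sample_example(rng, teacher):
    """Assumed toy distribution: x ~ N(0, I_d), y = tanh(<teacher, x>)."""
    x = [rng.gauss(0.0, 1.0) for _ in teacher]
    return x, sigma_star(x, teacher)


def estimate_population_risk(theta, teacher, n_samples=2000, seed=0):
    """Monte Carlo estimate of R_N(theta) = E[(y - y_hat(x; theta))^2]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x, y = sample_example(rng, teacher)
        total += (y - predict(x, theta)) ** 2
    return total / n_samples


teacher = [1.0, -0.5, 0.25]
# N = 4 hidden units, all equal to the teacher: y_hat matches y on every input.
theta_at_teacher = [list(teacher) for _ in range(4)]
# N = 4 hidden units, all zero: y_hat is identically 0.
theta_at_zero = [[0.0, 0.0, 0.0] for _ in range(4)]

risk_teacher = estimate_population_risk(theta_at_teacher, teacher)
risk_zero = estimate_population_risk(theta_at_zero, teacher)
print(risk_teacher, risk_zero)
```

In this toy setup, placing every unit at the teacher parameters gives a global minimum with risk zero (up to floating-point rounding), while the all-zero configuration has strictly positive risk. The sketch only evaluates the risk at chosen points; it says nothing about which other critical points exist, which is the open question.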
References
Understanding the optimization landscape of two-layers neural networks is largely an open problem even when we have access to an infinite number of examples, i.e. to the population risk R_{N}(\theta).
Source: Mei et al., 2018, "A Mean Field View of the Landscape of Two-Layers Neural Networks" (arXiv:1804.06561), Introduction