Efficient approximation of optimal neural network parameters by stochastic gradient methods
Determine conditions under which the optimal parameters of a neural network, whose existence is guaranteed by universal representation theorems, can be efficiently approximated by conventional algorithms such as stochastic gradient descent, despite the non-convexity of the training objective.
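As a concrete (and purely illustrative) instance of the setting, the sketch below trains a two-layer network in the mean-field scaling with noisy stochastic gradient descent, i.e. an Euler discretization of Langevin dynamics. The toy regression task, network width, step size, and noise level are all assumptions made for illustration; the cited paper's analysis concerns the continuous-time mean-field Langevin dynamics rather than this particular discretization.

```python
# Minimal sketch, assuming a toy 1-D regression task, tanh activations,
# mean-field scaling 1/m, and an illustrative noise level sigma.
# Noisy SGD here is plain gradient descent plus Gaussian noise of
# magnitude sqrt(2 * lr) * sigma, i.e. a discretized Langevin step.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: approximate y = sin(2*pi*x) on [0, 1].
n, m, lr, sigma, steps = 256, 128, 0.05, 0.05, 5000
X = rng.uniform(0.0, 1.0, size=(n, 1))
y = np.sin(2 * np.pi * X[:, 0])

# Two-layer network in mean-field scaling:
# f(x) = (1/m) * sum_i a_i * tanh(w_i * x + b_i).
a = rng.normal(size=m)
w = rng.normal(size=m)
b = rng.normal(size=m)

def forward(x):
    # x: (n, 1) -> hidden activations (n, m) -> output (n,)
    h = np.tanh(x @ w[None, :] + b[None, :])
    return h @ a / m, h

for t in range(steps):
    pred, h = forward(X)
    err = pred - y  # (n,)
    # Gradients of the mean-squared error with respect to a, w, b.
    grad_a = h.T @ err / (n * m)
    dh = (err[:, None] * a[None, :] / m) * (1.0 - h ** 2)  # (n, m)
    grad_w = (dh * X).sum(axis=0) / n
    grad_b = dh.sum(axis=0) / n
    # Langevin step: gradient descent plus isotropic Gaussian noise.
    noise = np.sqrt(2 * lr) * sigma
    a = a - lr * grad_a + noise * rng.normal(size=m)
    w = w - lr * grad_w + noise * rng.normal(size=m)
    b = b - lr * grad_b + noise * rng.normal(size=m)

print("final mse:", float(np.mean((forward(X)[0] - y) ** 2)))
```

The added noise plays the role of the entropic regularization in the mean-field Langevin picture; with sigma set to zero the loop reduces to ordinary full-batch gradient descent on the non-convex objective.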
References
Furthermore while universal representation theorems ensures the existence of the optimal parameters of the network, it is in general not known when such optimal parameters can be efficiently approximated by conventional algorithms, such as stochastic gradient descent.
— Mean-Field Langevin Dynamics and Energy Landscape of Neural Networks (Hu et al., 2019, arXiv:1905.07769), Section 1 (Introduction), p. 2