On the loss landscape of a class of deep neural networks with no bad local valleys

Published 27 Sep 2018 in cs.LG, cs.AI, cs.CV, and stat.ML | (1809.10749v2)

Abstract: We identify a class of over-parameterized deep neural networks with standard activation functions and cross-entropy loss that provably have no bad local valleys, in the sense that from any point in parameter space there exists a continuous path on which the cross-entropy loss is non-increasing and gets arbitrarily close to zero. This implies that these networks have no sub-optimal strict local minima.
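
The property stated in the abstract can be written out as a formula. The sketch below is only one way to formalize the abstract's wording: the symbols $L$ (the cross-entropy loss), $\Theta$ (the parameter space), and $\gamma$ (the path) are notation assumed here for illustration, and the paper's own formal definition may be phrased differently.

```latex
% Assumed notation (not taken from the paper):
%   \Theta : parameter space,  L : \Theta \to [0, \infty) : cross-entropy loss,
%   \gamma : a continuous path in parameter space.
% "No bad local valley", following the wording of the abstract:
\forall\, \theta_0 \in \Theta \;\; \exists\, \gamma : [0,1) \to \Theta
  \text{ continuous with } \gamma(0) = \theta_0 \text{ such that}
\quad
s \le t \;\Longrightarrow\; L(\gamma(t)) \le L(\gamma(s)),
\qquad
\lim_{t \to 1^-} L(\gamma(t)) = 0 .
```

Under this reading, the final claim of the abstract follows from a short argument. Suppose $\theta^*$ were a strict local minimum with $L(\theta^*) > 0$, and take a path $\gamma$ as above with $\gamma(0) = \theta^*$. Because the loss along $\gamma$ tends to zero, the path must leave $\theta^*$; by continuity, just after it leaves it passes through points $\theta \neq \theta^*$ arbitrarily close to $\theta^*$ with $L(\theta) \le L(\theta^*)$. This contradicts strictness, which requires $L(\theta) > L(\theta^*)$ for every $\theta \neq \theta^*$ in some neighborhood of $\theta^*$.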