Papers
Topics
Authors
Recent
Search
2000 character limit reached

The loss landscape of overparameterized neural networks

Published 26 Apr 2018 in cs.LG, cs.AI, cs.NE, and stat.ML | (1804.10200v1)

Abstract: We explore some mathematical features of the loss landscape of overparameterized neural networks. A priori one might imagine that the loss function looks like a typical function from $\mathbb{R}n$ to $\mathbb{R}$ - in particular, nonconvex, with discrete global minima. In this paper, we prove that in at least one important way, the loss function of an overparameterized neural network does not look like a typical function. If a neural net has $n$ parameters and is trained on $d$ data points, with $n>d$, we show that the locus $M$ of global minima of $L$ is usually not discrete, but rather an $n-d$ dimensional submanifold of $\mathbb{R}n$. In practice, neural nets commonly have orders of magnitude more parameters than data points, so this observation implies that $M$ is typically a very high-dimensional subset of $\mathbb{R}n$.

Citations (67)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Authors (1)

Collections

Sign up for free to add this paper to one or more collections.