Papers
Topics
Authors
Recent
Search
2000 character limit reached

Geometry of the Loss Landscape in Overparameterized Neural Networks: Symmetries and Invariances

Published 25 May 2021 in cs.LG | (2105.12221v2)

Abstract: We study how permutation symmetries in overparameterized multi-layer neural networks generate `symmetry-induced' critical points. Assuming a network with $ L $ layers of minimal widths $ r_1*, \ldots, r_{L-1}* $ reaches a zero-loss minimum at $ r_1*! \cdots r_{L-1}*! $ isolated points that are permutations of one another, we show that adding one extra neuron to each layer is sufficient to connect all these previously discrete minima into a single manifold. For a two-layer overparameterized network of width $ r*+ h =: m $ we explicitly describe the manifold of global minima: it consists of $ T(r*, m) $ affine subspaces of dimension at least $ h $ that are connected to one another. For a network of width $m$, we identify the number $G(r,m)$ of affine subspaces containing only symmetry-induced critical points that are related to the critical points of a smaller network of width $r<r*$. Via a combinatorial analysis, we derive closed-form formulas for $ T $ and $ G $ and show that the number of symmetry-induced critical subspaces dominates the number of affine subspaces forming the global minima manifold in the mildly overparameterized regime (small $ h $) and vice versa in the vastly overparameterized regime ($h \gg r*$). Our results provide new insights into the minimization of the non-convex loss function of overparameterized neural networks.

Citations (77)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.