
Convergence of gradient descent for polynomial neural networks with activation degree r ≥ 2

Establish theoretical guarantees (e.g., global convergence under suitable conditions) for the trajectories of gradient descent when training polynomial neural networks with monomial activation degree r ≥ 2.


Background

The paper surveys known results showing gradient descent converges to global minima for deep linear networks (r=1) under mild assumptions, and highlights that analogous results for nonlinear polynomial networks are currently missing.

The authors identify extending these convergence analyses from the linear case to polynomial neural networks with higher-degree monomial activations as an open problem, noting that higher activation degrees reduce the parameter symmetries of the network and may therefore change the geometry of the loss landscape.
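To make the setting concrete, the following minimal sketch trains a two-layer polynomial network with monomial activation x ↦ x^r for r = 2 by plain full-batch gradient descent on a squared loss, with targets produced by a teacher network so that a zero-loss global minimum exists. All names, dimensions, and hyperparameters here are illustrative assumptions, not taken from the paper; whether such trajectories provably reach a global minimum is exactly the open question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two-layer polynomial network f(x) = W2 @ (W1 @ x)**r with monomial
# activation degree r = 2 (illustrative; the open problem covers any r >= 2).
r = 2
d_in, d_hid, d_out, n = 3, 4, 1, 32

X = rng.standard_normal((d_in, n))
# Teacher network generates the targets, so a global minimum with loss 0 exists.
W1_true = rng.standard_normal((d_hid, d_in))
W2_true = rng.standard_normal((d_out, d_hid))
Y = W2_true @ (W1_true @ X) ** r

# Small random initialization (hyperparameters chosen for illustration only).
W1 = 0.1 * rng.standard_normal((d_hid, d_in))
W2 = 0.1 * rng.standard_normal((d_out, d_hid))

lr = 1e-2
loss0 = None
for step in range(20000):
    H = W1 @ X                      # pre-activations, shape (d_hid, n)
    A = H ** r                      # monomial activation applied entrywise
    resid = W2 @ A - Y              # residuals, shape (d_out, n)
    loss = 0.5 * np.mean(resid ** 2)
    if loss0 is None:
        loss0 = loss
    # Analytic gradients of the mean-squared loss via the chain rule.
    G_A = (W2.T @ resid) / n            # dL/dA
    grad_W2 = (resid @ A.T) / n         # dL/dW2
    grad_W1 = (r * G_A * H ** (r - 1)) @ X.T  # dL/dW1
    W1 -= lr * grad_W1
    W2 -= lr * grad_W2

print(f"initial loss: {loss0:.3e}, final loss: {loss:.3e}")
```

For r = 1 this reduces to a deep linear network, where convergence to a global minimum is known under mild assumptions; for r ≥ 2 the loss remains nonconvex in a way not covered by those analyses, and the run above comes with no guarantee that the final loss is globally minimal.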

References

The convergence trajectory of polynomial neural networks with activation degree $r \geq 2$ remains an open problem.

Geometry of Polynomial Neural Networks (Kubjas et al., arXiv:2402.00949, 1 Feb 2024), Section 7.4 (Case study: Linear Neural Networks)