Grassmannian Geometry and Global Convergence of Variable Projection for Neural Networks
Abstract: Training deep neural networks and Physics-Informed Neural Networks (PINNs) often leads to ill-conditioned and stiff optimization problems. A key structural feature of these models is that they are linear in the output-layer parameters and nonlinear in the hiddenlayer parameters, yielding a separable nonlinear least-squares formulation. In this work, we study the classical variable projection (VarPro) method for such problems in the context of deep neural networks. We provide a geometric formulation on the Grassmannian and analyze the structure of critical points and convergence properties of the reduced problem. When the feature map is parametrized by a neural network, we show that these properties persist except in rank-deficient regimes, which we address via a regularized Grassmannian framework. Numerical experiments for regression and PINNs, including an efficient solver for the heat equation, illustrate the practical effectiveness of the approach.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.