
Genericity of Polyak-Lojasiewicz Inequalities for Entropic Mean-Field Neural ODEs (2507.08486v1)

Published 11 Jul 2025 in math.OC, math.AP, and math.PR

Abstract: We address the behavior of idealized deep residual neural networks (ResNets), modeled via an optimal control problem set over continuity (or adjoint transport) equations. The continuity equations describe the statistical evolution of the features in the asymptotic regime where the layers of the network form a continuum. The velocity field is expressed through the network activation function, which is itself viewed as a function of the statistical distribution of the network parameters (weights and biases). From a mathematical standpoint, the control is interpreted in a relaxed sense, taking values in the space of probability measures over the set of parameters. We investigate the optimal behavior of the network when the cost functional arises from a regression problem and includes an additional entropic regularization term on the distribution of the parameters. In this framework, we focus in particular on the existence of stable optimizers, that is, optimizers at which the Hessian of the cost is non-degenerate. We show that, for an open and dense set of initial data, understood here as probability distributions over features and associated labels, there exists a unique stable global minimizer of the control problem. Moreover, we show that such minimizers satisfy a local Polyak-Lojasiewicz inequality, which can lead to exponential convergence of the corresponding gradient descent when the initialization lies sufficiently close to the optimal parameters. This result thus demonstrates the genericity (with respect to the distribution of features and labels) of the Polyak-Lojasiewicz condition in ResNets with a continuum of layers and under entropic penalization.
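
For readers unfamiliar with the Polyak-Lojasiewicz condition, the following is a minimal sketch of why such an inequality yields exponential convergence. It is written for the standard finite-dimensional Euclidean analogue, not for the measure-valued control setting of the paper. The function f, its minimal value f^*, the constant \mu, the neighborhood U, and the gradient-flow trajectory x_t are all illustrative assumptions that do not appear in the abstract.

```latex
% Assumption: f is differentiable, f^* = \min f, and the local PL inequality
% holds with some constant \mu > 0 on a neighborhood U of the minimizer:
\[
  \tfrac{1}{2}\,\|\nabla f(x)\|^{2} \;\ge\; \mu\,\bigl(f(x) - f^{*}\bigr),
  \qquad x \in U .
\]
% Along the gradient flow \dot{x}_t = -\nabla f(x_t), as long as x_t stays in U:
\[
  \frac{d}{dt}\bigl(f(x_t) - f^{*}\bigr)
  \;=\; -\,\|\nabla f(x_t)\|^{2}
  \;\le\; -\,2\mu\,\bigl(f(x_t) - f^{*}\bigr).
\]
% Gronwall's lemma then gives exponential decay of the optimality gap:
\[
  f(x_t) - f^{*} \;\le\; e^{-2\mu t}\,\bigl(f(x_0) - f^{*}\bigr).
\]
```

Because the inequality is only local, the estimate applies only while the trajectory remains in U, which is consistent with the abstract's requirement that the initialization lie sufficiently close to the optimal parameters.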
