Gradient descent for computing optimal controls in neural ODEs
Develop detailed, rigorous results on the use of gradient descent to compute optimal controls—i.e., parameter paths (U_s, V_s, b_s) over s ∈ [0,1]—for neural ordinary differential equation models that map inputs x^i to outputs y^i via the terminal state x_{s=1}, including convergence guarantees and characterization of the obtained controls.
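The setup can be made concrete with a minimal sketch: the neural ODE dx/ds = V_s σ(U_s x + b_s) is discretized by forward Euler with K steps, the control (U_s, V_s, b_s) becomes one parameter triple per step, and plain gradient descent minimizes the squared terminal-state loss. The tanh activation, step count, finite-difference gradients (a slow stand-in for the adjoint/backpropagation computation used in practice), and all numerical values are illustrative assumptions, not taken from the source.

```python
import numpy as np

rng = np.random.default_rng(0)
d, K = 2, 10          # state dimension, number of Euler steps on [0, 1]
h = 1.0 / K           # step size in the "depth" variable s

x0 = np.array([1.0, -1.0])   # input x^i (hypothetical data)
y  = np.array([0.5,  0.5])   # target output y^i (hypothetical data)

# Discretized control path: one flattened (U_k, V_k, b_k) per Euler step.
theta = 0.1 * rng.standard_normal((K, 2 * d * d + d))

def unpack(p):
    U = p[:d * d].reshape(d, d)
    V = p[d * d:2 * d * d].reshape(d, d)
    b = p[2 * d * d:]
    return U, V, b

def forward(theta):
    """Map x0 to the terminal state x_{s=1} by Euler steps of the neural ODE."""
    x = x0.copy()
    for k in range(K):
        U, V, b = unpack(theta[k])
        x = x + h * V @ np.tanh(U @ x + b)
    return x

def loss(theta):
    r = forward(theta) - y
    return 0.5 * r @ r

def grad(theta, eps=1e-6):
    """Central finite-difference gradient of the terminal loss w.r.t. theta."""
    g = np.zeros_like(theta)
    flat, gf = theta.ravel(), g.ravel()   # views into theta and g
    for i in range(flat.size):
        old = flat[i]
        flat[i] = old + eps; lp = loss(theta)
        flat[i] = old - eps; lm = loss(theta)
        flat[i] = old
        gf[i] = (lp - lm) / (2 * eps)
    return g

l0 = loss(theta)
for _ in range(300):
    theta -= 0.2 * grad(theta)   # plain gradient descent on the control path
print(float(loss(theta)))
```

The open questions the excerpt points to are precisely what this sketch leaves unanswered: whether such iterations converge, at what rate, and what structure the limiting control path (U_s, V_s, b_s) has.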
References
However, the specificity of learning theory compared to control theory lies in the goal of computing such a control using gradient descent (eq. grad-desc). To date, no detailed results exist on this.
— The Mathematics of Artificial Intelligence
(2501.10465 - Peyré, 15 Jan 2025) in Section “Very Deep Networks”, Neural Differential Equation paragraph