Adaptive NAG that is both faster than gradient descent and stable

Determine whether there exist adaptive implementations of Nesterov accelerated gradient (NAG), in which the momentum parameter β and the step size Δs vary from iteration to iteration, that are both faster than standard gradient descent and stable when applied to adjoint-based optimization of the time-dependent electric field E(t) that controls defibrillation in the two-dimensional Fenton–Karma cardiac tissue model.

Background

The paper uses adjoint optimization to design ultra-low-energy defibrillation signals by minimizing a cost functional over time-dependent electric fields E(t). Standard gradient descent can fail to find defibrillating protocols over long temporal horizons (e.g., T = 600 ms), motivating the use of Nesterov accelerated gradient (NAG) to improve convergence.

Although NAG enabled the discovery of low-energy, defibrillating protocols where standard gradient descent failed, the authors observed that its behavior is sensitive to the choice of hyperparameters (momentum β and step size Δs), and that some settings lead to stability issues. This motivates the explicit question of whether adaptive schemes for β and Δs can guarantee both speed gains over gradient descent and stability in this adjoint-optimization context.
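To make "adaptive β and Δs" concrete, the sketch below combines two textbook heuristics on a toy problem: backtracking on the step size and a restart of the momentum whenever the cost would increase. It is not the authors' method. The two-dimensional quadratic stands in for the defibrillation cost functional (which in the paper requires forward and adjoint PDE solves), and every name here (`CURVATURES`, `cost`, `grad`, `gradient_descent`, `adaptive_nag`, `shrink`) is hypothetical. Whether such a scheme stays stable and fast on the Fenton–Karma problem is exactly the open question.

```python
import math

# Toy stand-in for the cost functional: an ill-conditioned 2-D quadratic.
# The curvatures are hypothetical and unrelated to the cardiac model.
CURVATURES = (1.0, 100.0)


def cost(x):
    return 0.5 * sum(a * xi * xi for a, xi in zip(CURVATURES, x))


def grad(x):
    return [a * xi for a, xi in zip(CURVATURES, x)]


def gradient_descent(x0, step, n_iters):
    """Plain gradient descent with a fixed step size."""
    x = list(x0)
    for _ in range(n_iters):
        x = [xi - step * gi for xi, gi in zip(x, grad(x))]
    return x


def adaptive_nag(x0, step0, n_iters, shrink=0.5):
    """NAG with a backtracked step size and a function-value restart.

    The momentum beta follows the usual (t - 1) / t_next schedule and is
    reset to zero whenever a step would increase the cost, so the accepted
    costs are non-increasing by construction.
    """
    x = list(x0)           # last accepted iterate
    y = list(x0)           # look-ahead point where the gradient is evaluated
    t = 1.0                # momentum schedule variable
    step = step0           # current step size (only ever shrinks here)
    history = [cost(x)]    # cost of every accepted iterate
    for _ in range(n_iters):
        g = grad(y)
        fy = cost(y)
        g2 = sum(gi * gi for gi in g)
        # Backtracking: shrink the step until a sufficient-decrease test holds.
        for _attempt in range(60):
            x_new = [yi - step * gi for yi, gi in zip(y, g)]
            if cost(x_new) <= fy - 0.5 * step * g2:
                break
            step *= shrink
        if cost(x_new) > history[-1]:
            # Restart: discard the step and drop the accumulated momentum.
            y = list(x)
            t = 1.0
            continue
        t_next = 0.5 * (1.0 + math.sqrt(1.0 + 4.0 * t * t))
        beta = (t - 1.0) / t_next
        y = [xn + beta * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_next
        history.append(cost(x))
    return x, history, step


x_gd = gradient_descent([1.0, 1.0], step=0.01, n_iters=200)
x_nag, history, final_step = adaptive_nag([1.0, 1.0], step0=0.05, n_iters=200)
print("gradient descent cost:", cost(x_gd))
print("adaptive NAG cost:    ", cost(x_nag), "final step:", final_step)
```

On this toy problem the restart rule enforces a monotone cost, which is one possible reading of "stable". It offers no guarantee for the non-convex, PDE-constrained setting of the paper, where each cost or gradient evaluation is also far more expensive than here.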

References

It remains to be explored whether there are adaptive implementations of NAG with variable β and Δs that are both faster than standard gradient descent and stable.

Ultra-low-energy defibrillation through adjoint optimization (2407.05115 - Garzon et al., 2024) in Section 3.4 (Improved gradient descent)