Quantifying the benefit of solver differentiability for unrolled training

Determine the extent to which a differentiable numerical solver, as opposed to a non-differentiable one, improves the accuracy of neural networks trained on unrolled trajectories to evolve partial differential equation (PDE) dynamics in hybrid simulator architectures (solver-in-the-loop training).

Background

Unrolled training of autoregressive neural simulators often relies on differentiable or adjoint numerical solvers to backpropagate gradients through time; prior studies report that this can improve performance, particularly for longer optimization horizons. However, many existing scientific and engineering code bases are not differentiable, raising practical questions about the necessity and impact of solver differentiability when integrating machine learning with established numerical methods.

The paper contrasts differentiable unrolling (full gradient propagation through solver-network chains) with non-differentiable unrolling (where gradients are truncated at solver steps) to disentangle data shift reduction from long-term gradient effects. Within this context, the authors explicitly highlight the need to establish how solver differentiability contributes to training accuracy.
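The distinction between the two unrolling modes can be sketched in JAX. This is a minimal illustrative toy, not the paper's setup: the periodic diffusion stencil stands in for an arbitrary numerical solver, the single scalar weight stands in for a neural network correction, and all names (`solver_step`, `network`, `unrolled_loss`) are hypothetical. The only load-bearing detail is `stop_gradient`, which truncates backpropagation at each solver step while leaving the forward unroll (and hence its data-shift-reduction effect) unchanged.

```python
import jax
import jax.numpy as jnp

def solver_step(u):
    # Toy "numerical solver": one explicit step of periodic 1D diffusion.
    return u + 0.1 * (jnp.roll(u, 1) - 2.0 * u + jnp.roll(u, -1))

def network(params, u):
    # Toy learned correction: a single scalar weight standing in for a network.
    return u + params["w"] * u

def unrolled_loss(params, u0, targets, differentiable):
    # Unroll the solver-network chain over the trajectory and accumulate MSE.
    u, loss = u0, 0.0
    for target in targets:
        u = solver_step(u)
        if not differentiable:
            # Non-differentiable unrolling: cut the gradient at the solver
            # output, so backprop cannot flow into earlier chain steps.
            u = jax.lax.stop_gradient(u)
        u = network(params, u)
        loss = loss + jnp.mean((u - target) ** 2)
    return loss / len(targets)

params = {"w": jnp.array(0.1)}
u0 = jnp.linspace(0.0, 1.0, 8)
targets = [0.9 * u0, 0.8 * u0]

# Same forward unroll in both cases; only the gradient path differs.
g_full = jax.grad(unrolled_loss)(params, u0, targets, True)
g_trunc = jax.grad(unrolled_loss)(params, u0, targets, False)
```

Comparing `g_full["w"]` with `g_trunc["w"]` shows the two modes yield different parameter gradients even on this linear toy, which is exactly the long-term gradient effect the paper seeks to disentangle from data shift reduction.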

References

A resulting open question is how much the numerical solver's differentiability assists in training accurate networks.

Differentiability in Unrolled Training of Neural Physics Simulators on Transient Dynamics (2402.12971 - List et al., 20 Feb 2024) in Related work, Unrolled training paragraph