A note on the adjoint method for neural ordinary differential equation network (2402.15141v1)
Published 23 Feb 2024 in math.NA, cs.LG, and cs.NA
Abstract: The perturbation and operator adjoint methods are used to derive the correct adjoint form rigorously. The derivation yields the following results: 1) the loss gradient is not an ODE but an integral, and we show why; 2) the traditional adjoint form is not equivalent to the backpropagation (BP) result; 3) the adjoint operator analysis shows that the adjoint form gives the same result as BP if and only if the discrete adjoint uses the same scheme as the discrete neural ODE.
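The third result can be illustrated with a toy sketch. It is not taken from the paper: the scalar linear model f(z, θ) = θz, the loss L = z(T)²/2, the forward Euler scheme, and all function names below are assumptions chosen so that the gradients can be checked by hand. The sketch accumulates the loss gradient as a sum over steps (a discretized integral) in two ways: with an adjoint recursion that mirrors the forward Euler scheme, and with one hypothetical mismatched discretization that evaluates the same terms at the other end of each step. Only the first agrees with the gradient of the discretized network, which is what BP computes.

```python
def f(z, theta):
    """Assumed toy dynamics dz/dt = theta * z."""
    return theta * z

def df_dz(z, theta):
    return theta

def df_dtheta(z, theta):
    return z

def euler_forward(z0, theta, h, n_steps):
    """Discrete neural ODE: z_{k+1} = z_k + h * f(z_k, theta)."""
    zs = [z0]
    for _ in range(n_steps):
        zs.append(zs[-1] + h * f(zs[-1], theta))
    return zs

def loss(z0, theta, h, n_steps):
    """Assumed loss L = 0.5 * z_N**2 on the final Euler state."""
    return 0.5 * euler_forward(z0, theta, h, n_steps)[-1] ** 2

def grad_matched_adjoint(z0, theta, h, n_steps):
    """Adjoint recursion obtained by transposing the forward Euler step.

    This is the chain rule applied to the discrete network, i.e. BP.
    """
    zs = euler_forward(z0, theta, h, n_steps)
    a = zs[-1]          # a_N = dL/dz_N
    grad = 0.0
    for k in reversed(range(n_steps)):
        grad += h * a * df_dtheta(zs[k], theta)   # uses a_{k+1} and z_k
        a = a * (1.0 + h * df_dz(zs[k], theta))   # a_k from a_{k+1}
    return grad

def grad_mismatched_adjoint(z0, theta, h, n_steps):
    """Hypothetical mismatched scheme: same terms, evaluated at z_{k+1}."""
    zs = euler_forward(z0, theta, h, n_steps)
    a = zs[-1]
    grad = 0.0
    for k in reversed(range(n_steps)):
        grad += h * a * df_dtheta(zs[k + 1], theta)
        a = a + h * a * df_dz(zs[k + 1], theta)
    return grad

def grad_finite_difference(z0, theta, h, n_steps, eps=1e-6):
    """Central difference of the discretized loss, a stand-in for BP."""
    hi = loss(z0, theta + eps, h, n_steps)
    lo = loss(z0, theta - eps, h, n_steps)
    return (hi - lo) / (2.0 * eps)

if __name__ == "__main__":
    args = (1.0, 0.5, 0.1, 10)  # z0, theta, step size h, number of steps
    print("finite difference (BP reference):", grad_finite_difference(*args))
    print("matched discrete adjoint:        ", grad_matched_adjoint(*args))
    print("mismatched discrete adjoint:     ", grad_mismatched_adjoint(*args))
```

For this linear model the matched gradient has the closed form N·h·z0²·(1 + hθ)^(2N−1), and the mismatched variant is larger by exactly the factor (1 + hθ). Both approach the continuous-time gradient as h → 0, but for a finite step size only the matched scheme reproduces the gradient of the discretized network.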