
Homotopy-based training of NeuralODEs for accurate dynamics discovery (2210.01407v6)

Published 4 Oct 2022 in cs.LG, math.DS, math.OC, and physics.app-ph

Abstract: Neural Ordinary Differential Equations (NeuralODEs) present an attractive way to extract dynamical laws from time series data, as they bridge neural networks with the differential equation-based modeling paradigm of the physical sciences. However, these models often display long training times and suboptimal results, especially for longer-duration data. While a common strategy in the literature imposes strong constraints on the NeuralODE architecture to inherently promote stable model dynamics, such methods are ill-suited for dynamics discovery, as the unknown governing equation is not guaranteed to satisfy the assumed constraints. In this paper, we develop a new training method for NeuralODEs, based on synchronization and homotopy optimization, that does not require changes to the model architecture. We show that synchronizing the model dynamics and the training data tames the originally irregular loss landscape, which homotopy optimization can then leverage to enhance training. Through benchmark experiments, we demonstrate that our method achieves competitive or better training loss while often requiring less than half the number of training epochs compared to other model-agnostic techniques. Furthermore, models trained with our method display better extrapolation capabilities, highlighting the effectiveness of our approach.
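The abstract describes the method only at a high level. The sketch below illustrates the core idea under stated assumptions: a coupling term kappa * (u(t) - x) is added to the learned vector field so the model trajectory is synchronized with (nudged toward) the data, and the coupling strength kappa is annealed to zero over training, which serves as the homotopy parameter and leaves a plain, architecture-unchanged NeuralODE at the end. It assumes the torchdiffeq package; all names (VectorField, SynchronizedODE, the kappa schedule, the placeholder data) are illustrative and not taken from the authors' code.

```python
# Minimal sketch of synchronization-based homotopy training for a NeuralODE.
# Assumes torchdiffeq; hyperparameters and the kappa schedule are illustrative.
import torch
import torch.nn as nn
from torchdiffeq import odeint


class VectorField(nn.Module):
    """Learnable right-hand side f_theta(x) of the NeuralODE."""

    def __init__(self, dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim)
        )

    def forward(self, t, x):
        return self.net(x)


class SynchronizedODE(nn.Module):
    """Model dynamics plus a coupling term kappa * (u(t) - x) that pulls the
    trajectory toward the observed data, taming the loss landscape."""

    def __init__(self, f, ts, us):
        super().__init__()
        self.f = f
        self.ts, self.us = ts, us  # observation times (T,) and states (T, dim)
        self.kappa = 0.0           # coupling strength = homotopy parameter

    def interp_data(self, t):
        # Piecewise-constant lookup of the nearest earlier observation
        # (a simplification; smoother interpolants are also possible).
        idx = torch.searchsorted(self.ts, t.detach().reshape(1))
        idx = idx.clamp(max=len(self.ts) - 1)
        return self.us[idx].squeeze(0)

    def forward(self, t, x):
        return self.f(t, x) + self.kappa * (self.interp_data(t) - x)


# Placeholder trajectory data; in practice ts, us come from measurements.
dim = 2
ts = torch.linspace(0.0, 10.0, 100)
us = torch.randn(100, dim)

f = VectorField(dim)
model = SynchronizedODE(f, ts, us)
opt = torch.optim.Adam(f.parameters(), lr=1e-3)

# Homotopy schedule: start strongly coupled, anneal kappa -> 0 so the
# final optimization stage trains the plain, uncoupled NeuralODE.
for kappa in [1.0, 0.5, 0.25, 0.1, 0.0]:
    model.kappa = kappa
    for _ in range(200):  # epochs per homotopy step (illustrative)
        opt.zero_grad()
        pred = odeint(model, us[0], ts)       # integrate coupled dynamics
        loss = ((pred - us) ** 2).mean()      # trajectory-matching MSE
        loss.backward()
        opt.step()
```

The design choice worth noting is that the coupling lives only in the training loop, not in the architecture: once kappa reaches zero, the model being optimized is exactly the unconstrained NeuralODE, consistent with the abstract's claim that no architectural changes are required.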
