Two Sides of The Same Coin: Bridging Deep Equilibrium Models and Neural ODEs via Homotopy Continuation (2310.09583v2)

Published 14 Oct 2023 in cs.LG and stat.ML

Abstract: Deep Equilibrium Models (DEQs) and Neural Ordinary Differential Equations (Neural ODEs) are two branches of implicit models that have achieved remarkable success owing to their superior performance and low memory consumption. Although both are implicit models, DEQs and Neural ODEs are derived from different mathematical formulations. Inspired by homotopy continuation, a classical method for solving nonlinear equations by following a corresponding ODE, we establish a connection between these two models and show that they are in fact two sides of the same coin. Given this connection, we propose a new implicit model called HomoODE that inherits the high accuracy of DEQs and the stability of Neural ODEs. Unlike DEQs, which explicitly solve an equilibrium-point-finding problem with Newton-type methods in the forward pass, HomoODE solves this problem implicitly with a modified Neural ODE via homotopy continuation. We further develop an acceleration method for HomoODE based on a shared learnable initial point. Notably, our model also helps explain why Augmented Neural ODEs work, provided the augmented part is regarded as the equilibrium point to be found. Comprehensive experiments on several image classification tasks demonstrate that HomoODE surpasses existing implicit models in terms of both accuracy and memory consumption.
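
The core construction is easiest to see on a toy fixed-point problem. The sketch below is a minimal illustration, not the paper's implementation: it uses a Newton homotopy H(z, t) = G(z) - (1 - t) G(z0) with G(z) = f(z) - z, so that integrating the induced ODE in t from 0 to 1 carries a cheap starting point z0 to the equilibrium a DEQ would find by root-finding. The toy map f, the particular choice of homotopy, and the Euler integrator are assumptions made for illustration only.

# Illustrative sketch (assumed details, not taken from the paper): homotopy
# continuation turns the fixed-point problem f(z) = z that a DEQ solves into
# an ODE in a homotopy parameter t, integrated from t = 0 to t = 1, which is
# the Neural-ODE-style forward pass that HomoODE builds on.
import numpy as np

A = np.array([[0.4, 0.1], [0.0, 0.3]])   # contractive toy "layer" weights
b = np.array([1.0, -0.5])

def f(z):
    return A @ z + b                     # fixed point z* satisfies f(z*) = z*

def G(z):
    return f(z) - z                      # roots of G are fixed points of f

def jac_G(z, eps=1e-6):
    # Finite-difference Jacobian of G; a real model would use autodiff.
    n = z.size
    J = np.zeros((n, n))
    for i in range(n):
        dz = np.zeros(n); dz[i] = eps
        J[:, i] = (G(z + dz) - G(z - dz)) / (2.0 * eps)
    return J

# Newton homotopy: H(z, t) = G(z) - (1 - t) * G(z0). At t = 0 its root is z0;
# at t = 1 its root is the equilibrium point. Differentiating H(z(t), t) = 0
# gives the continuation ODE  dz/dt = -J_G(z)^{-1} G(z0).
z0 = np.zeros(2)                         # initial point (shared and learnable in HomoODE)
z, steps = z0.copy(), 100
for k in range(steps):                   # forward Euler in t; any ODE solver works
    z = z + (1.0 / steps) * np.linalg.solve(jac_G(z), -G(z0))

print("z(1) from homotopy ODE:", z)
print("exact fixed point     :", np.linalg.solve(np.eye(2) - A, b))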

