
Subhomogeneous Deep Equilibrium Models (2403.00720v2)

Published 1 Mar 2024 in cs.LG, cs.NA, math.NA, and math.OC

Abstract: Implicit-depth neural networks have emerged as powerful alternatives to traditional networks in a variety of applications in recent years. However, these models often lack guarantees on the existence and uniqueness of their defining fixed points, raising stability, performance, and reproducibility issues. In this paper, we present a new analysis of the existence and uniqueness of fixed points for implicit-depth neural networks, based on the concept of subhomogeneous operators and nonlinear Perron-Frobenius theory. Compared to previous analyses of this kind, our theory allows for weaker assumptions on the parameter matrices, yielding a more flexible framework for well-defined implicit networks. We illustrate the performance of the resulting subhomogeneous networks on feedforward, convolutional, and graph neural network examples.
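To make the setting concrete, the sketch below shows the basic forward pass of a deep equilibrium (implicit-depth) layer: the output is a fixed point z* of z = f(z, x) rather than the result of stacking a fixed number of layers. The specific map (tanh nonlinearity, random W and U, spectral rescaling of W) is an illustrative assumption, not the paper's construction; the paper's contribution is a subhomogeneity condition on f under which such a fixed point is guaranteed to exist and be unique with weaker requirements on W than contraction-based analyses.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
W = rng.standard_normal((d, d))
W *= 0.5 / np.linalg.norm(W, 2)  # rescale so the map is a contraction (illustrative)
U = rng.standard_normal((d, d))
b = rng.standard_normal(d)

def f(z, x):
    # One application of the implicit layer's map.
    return np.tanh(W @ z + U @ x + b)

def deq_forward(x, tol=1e-8, max_iter=500):
    """Solve z = f(z, x) by plain fixed-point iteration."""
    z = np.zeros_like(x)
    for _ in range(max_iter):
        z_new = f(z, x)
        if np.linalg.norm(z_new - z) < tol:
            return z_new
        z = z_new
    return z

x = rng.standard_normal(d)
z_star = deq_forward(x)
# At the equilibrium, z_star satisfies z = f(z, x) up to the tolerance.
print(np.linalg.norm(z_star - f(z_star, x)))
```

In practice DEQ implementations use faster root-finders (e.g. Anderson acceleration or Broyden's method) and differentiate through the fixed point implicitly; the well-posedness question the paper addresses is precisely whether this z* exists and is unique for the chosen f.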

