LipKernel: Lipschitz-Bounded Convolutional Neural Networks via Dissipative Layers (2410.22258v1)
Abstract: We propose a novel layer-wise parameterization for convolutional neural networks (CNNs) that includes built-in robustness guarantees by enforcing a prescribed Lipschitz bound. Each layer in our parameterization is designed to satisfy a linear matrix inequality (LMI), which in turn implies dissipativity with respect to a specific supply rate. Collectively, these layer-wise LMIs ensure Lipschitz boundedness of the network's input-output mapping, yielding a more expressive parameterization than those based on spectral bounds or orthogonal layers. Our new method, LipKernel, directly parameterizes dissipative convolution kernels using a 2-D Roesser-type state-space model. This means that the convolutional layers are given in standard form after training and can be evaluated without computational overhead. In numerical experiments, we show that the run-time of our method is orders of magnitude lower than that of state-of-the-art Lipschitz-bounded networks that parameterize convolutions in the Fourier domain, making our approach particularly attractive for improving the robustness of learning-based real-time perception or control in robotics, autonomous vehicles, or automation systems. We focus on CNNs, and in contrast to previous works, our approach accommodates a wide variety of commonly used layer types, including 1-D and 2-D convolutional layers, maximum and average pooling layers, as well as strided and dilated convolutions and zero padding. However, our approach naturally extends beyond CNNs, as we can incorporate any layer that is incrementally dissipative.
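To make the state-space viewpoint concrete, the sketch below shows how a 2-D convolution kernel corresponds to the impulse response of a Roesser-type state-space model: the kernel entries are the outputs produced when a unit impulse is applied at spatial position (0, 0). This is only an illustration of the kernel/state-space correspondence; the function name, the NumPy implementation, and the random placeholder matrices are assumptions for illustration and do not reproduce the paper's dissipative parameterization, which additionally constrains the model matrices through the layer-wise LMIs.

```python
import numpy as np

def roesser_impulse_response(A11, A12, A21, A22, B1, B2, C1, C2, D, ksize):
    """Read off a 2-D convolution kernel as the impulse response of a
    Roesser-type state-space model (hypothetical helper, not from the paper).

    Recursion at spatial index (i, j), with zero boundary states:
        x1[i+1, j] = A11 x1[i, j] + A12 x2[i, j] + B1 u[i, j]   (advances in i)
        x2[i, j+1] = A21 x1[i, j] + A22 x2[i, j] + B2 u[i, j]   (advances in j)
        y[i, j]    = C1 x1[i, j] + C2 x2[i, j] + D  u[i, j]

    Returns K of shape (ksize, ksize, c_out, c_in); K[i, j] is the response
    at offset (i, j) to a unit impulse at (0, 0). If the A-blocks are
    nilpotent (finite impulse response), K is exactly the kernel; otherwise
    it is a truncation.
    """
    n1, n2 = A11.shape[0], A22.shape[0]
    c_out, c_in = D.shape
    K = np.zeros((ksize, ksize, c_out, c_in))
    for c in range(c_in):                          # one impulse per input channel
        x1 = np.zeros((ksize + 1, ksize, n1))      # zero boundary conditions
        x2 = np.zeros((ksize, ksize + 1, n2))
        for i in range(ksize):
            for j in range(ksize):
                u = np.eye(c_in)[:, c] if (i == 0 and j == 0) else np.zeros(c_in)
                K[i, j, :, c] = C1 @ x1[i, j] + C2 @ x2[i, j] + D @ u
                x1[i + 1, j] = A11 @ x1[i, j] + A12 @ x2[i, j] + B1 @ u
                x2[i, j + 1] = A21 @ x1[i, j] + A22 @ x2[i, j] + B2 @ u
    return K

# Toy usage with random placeholder matrices (they satisfy no LMI and are
# only meant to exercise the recursion):
rng = np.random.default_rng(0)
n1, n2, c_in, c_out = 2, 2, 3, 4
K = roesser_impulse_response(
    rng.standard_normal((n1, n1)), rng.standard_normal((n1, n2)),
    rng.standard_normal((n2, n1)), rng.standard_normal((n2, n2)),
    rng.standard_normal((n1, c_in)), rng.standard_normal((n2, c_in)),
    rng.standard_normal((c_out, n1)), rng.standard_normal((c_out, n2)),
    rng.standard_normal((c_out, c_in)), ksize=3,
)
print(K.shape)  # (3, 3, 4, 3): a 3x3 kernel mapping 3 input to 4 output channels
```

Because the kernel is recovered once after training, the layer can then be evaluated as an ordinary convolution at inference time, which is consistent with the run-time advantage the abstract reports over Fourier-domain parameterizations.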