Improving Lipschitz-Constrained Neural Networks by Learning Activation Functions (2210.16222v2)

Published 28 Oct 2022 in cs.LG

Abstract: Lipschitz-constrained neural networks have several advantages over unconstrained ones and can be applied to a variety of problems, making them a topic of attention in the deep learning community. Unfortunately, it has been shown both theoretically and empirically that they perform poorly when equipped with ReLU activation functions. By contrast, neural networks with learnable 1-Lipschitz linear splines are known to be more expressive. In this paper, we show that such networks correspond to global optima of a constrained functional optimization problem that consists of the training of a neural network composed of 1-Lipschitz linear layers and 1-Lipschitz freeform activation functions with second-order total-variation regularization. Further, we propose an efficient method to train these neural networks. Our numerical experiments show that our trained networks compare favorably with existing 1-Lipschitz neural architectures.
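
To make the ingredients in the abstract concrete, here is a minimal sketch (not the authors' implementation) of a network built from 1-Lipschitz linear layers and learnable 1-Lipschitz linear-spline activations, trained with a second-order total-variation (TV(2)) penalty on the spline coefficients. All class names, the knot count, the use of spectral normalization for the linear layers, and the hyperparameters below are assumptions chosen for illustration; the paper's actual parameterization and training scheme may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LearnableLinearSpline(nn.Module):
    """Pointwise linear-spline activation on a uniform knot grid over [-grid_range, grid_range]."""

    def __init__(self, num_knots: int = 21, grid_range: float = 2.0):
        super().__init__()
        self.register_buffer("grid", torch.linspace(-grid_range, grid_range, num_knots))
        self.step = float(self.grid[1] - self.grid[0])
        # Initialize to the identity map, which is 1-Lipschitz.
        self.values = nn.Parameter(self.grid.clone())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Piecewise-linear interpolation between knot values (inputs clamped to the grid).
        xc = x.clamp(float(self.grid[0]), float(self.grid[-1]))
        idx = ((xc - self.grid[0]) / self.step).floor().long().clamp(0, self.grid.numel() - 2)
        x0, v0, v1 = self.grid[idx], self.values[idx], self.values[idx + 1]
        return v0 + (v1 - v0) * (xc - x0) / self.step

    def tv2_penalty(self) -> torch.Tensor:
        # Second-order total variation: l1 norm of the second differences of the knot values,
        # which promotes sparse (few-knot) spline activations.
        d2 = self.values[2:] - 2 * self.values[1:-1] + self.values[:-2]
        return d2.abs().sum()

    def project_1_lipschitz(self) -> None:
        # Clip the slopes between knots to [-1, 1] so the activation stays 1-Lipschitz.
        with torch.no_grad():
            slopes = ((self.values[1:] - self.values[:-1]) / self.step).clamp_(-1.0, 1.0)
            self.values[1:] = self.values[0] + torch.cumsum(slopes * self.step, dim=0)


class LipschitzNet(nn.Module):
    """1-Lipschitz MLP: spectrally normalized linear layers + learnable spline activations."""

    def __init__(self, widths=(2, 64, 64, 1)):
        super().__init__()
        # spectral_norm keeps each layer's spectral norm near 1 via power iteration,
        # so each linear map is (approximately) 1-Lipschitz.
        self.layers = nn.ModuleList(
            nn.utils.parametrizations.spectral_norm(nn.Linear(m, n))
            for m, n in zip(widths[:-1], widths[1:])
        )
        self.acts = nn.ModuleList(LearnableLinearSpline() for _ in widths[1:-1])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer, act in zip(self.layers[:-1], self.acts):
            x = act(layer(x))
        return self.layers[-1](x)

    def tv2_penalty(self) -> torch.Tensor:
        return sum(act.tv2_penalty() for act in self.acts)


# Toy regression run: data-fit loss plus the TV(2) regularizer on the spline coefficients.
model = LipschitzNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(256, 2), torch.randn(256, 1)
for _ in range(200):
    loss = F.mse_loss(model(x), y) + 1e-4 * model.tv2_penalty()
    opt.zero_grad()
    loss.backward()
    opt.step()
    for act in model.acts:
        act.project_1_lipschitz()
```

The post-step projection and the shared per-layer spline are simplifications for readability; they only loosely mirror the constrained functional optimization problem described in the abstract.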
