
Spectrum Extraction and Clipping for Implicitly Linear Layers (2402.16017v2)

Published 25 Feb 2024 in cs.LG and cs.CV

Abstract: We show the effectiveness of automatic differentiation in efficiently and correctly computing and controlling the spectrum of implicitly linear operators, a rich family of layer types including all standard convolutional and dense layers. We provide the first clipping method which is correct for general convolution layers, and illuminate the representational limitation that caused correctness issues in prior work. We study the effect of the batch normalization layers when concatenated with convolutional layers and show how our clipping method can be applied to their composition. By comparing the accuracy and performance of our algorithms to the state-of-the-art methods, using various experiments, we show they are more precise and efficient and lead to better generalization and adversarial robustness. We provide the code for using our methods at https://github.com/Ali-E/FastClip.
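
As a rough illustration of the core idea (a sketch, not the paper's FastClip implementation), the snippet below estimates the largest singular value of a convolution layer by power iteration, using PyTorch autograd to apply the adjoint operator as a vector-Jacobian product so the layer is never materialized as an explicit matrix. The helper names, the iteration count, and the whole-operator rescaling in clip_top_singular_value are illustrative assumptions; the bias is disabled so the layer acts as a purely linear map.

```python
import torch

def top_singular_value(layer, input_shape, n_iters=100):
    """Estimate the largest singular value of an implicitly linear
    layer via power iteration on A^T A. The forward pass applies the
    operator A; its adjoint A^T comes from automatic differentiation
    as a vector-Jacobian product, so no explicit matrix is formed."""
    v = torch.randn(1, *input_shape)
    for _ in range(n_iters):
        v = (v / v.norm()).detach().requires_grad_(True)
        u = layer(v)                                   # u = A v
        # v <- A^T u, computed as a VJP through the forward pass
        (v,) = torch.autograd.grad(u, v, grad_outputs=u)
    v = v / v.norm()
    with torch.no_grad():
        return layer(v).norm().item()                  # sigma_max ~ ||A v||

def clip_top_singular_value(layer, input_shape, c=1.0):
    """Naive whole-operator rescaling: if sigma_max exceeds c, scale
    the weights so the spectral norm drops to c. (The paper's clipping
    is finer-grained, acting on individual singular values.)"""
    sigma = top_singular_value(layer, input_shape)
    if sigma > c:
        with torch.no_grad():
            layer.weight.mul_(c / sigma)
    return sigma

# Example: a general (strided, padded) convolution, bias disabled
# so the forward pass is a purely linear operator.
conv = torch.nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1, bias=False)
print(top_singular_value(conv, input_shape=(3, 32, 32)))
```

Because the adjoint is obtained generically from autodiff, the same routine works unchanged for any implicitly linear layer (dense, strided or padded convolutions, and compositions thereof), which is the property the abstract highlights.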
