Structurally Prune Anything: Any Architecture, Any Framework, Any Time (2403.18955v1)

Published 3 Mar 2024 in cs.LG and cs.CV

Abstract: Neural network pruning serves as a critical technique for enhancing the efficiency of deep learning models. Unlike unstructured pruning, which only sets specific parameters to zero, structured pruning eliminates entire channels, thus yielding direct computational and storage benefits. However, the diverse patterns for coupling parameters, such as residual connections and group convolutions, the diverse deep learning frameworks, and the various time stages at which pruning can be performed make existing pruning methods less adaptable to different architectures, frameworks, and pruning criteria. To address this, we introduce Structurally Prune Anything (SPA), a versatile structured pruning framework that can prune neural networks with any architecture, from any framework, and at any stage of training. SPA leverages a standardized computational graph and ONNX representation to prune diverse neural network architectures without the need for manual intervention. SPA employs a group-level importance estimation method, which groups dependent computational operators, estimates their importance, and prunes unimportant coupled channels. This enables the transfer of various existing pruning criteria into a structured group style. As a result, SPA supports pruning at any time, either before training, after training with fine-tuning, or after training without fine-tuning. In the context of the latter, we introduce Optimal Brain SPA (OBSPA), an algorithm that achieves state-of-the-art pruning results needing neither fine-tuning nor calibration data. In extensive experiments, SPA shows competitive to state-of-the-art pruning performance across various architectures, from popular frameworks, at different pruning times.
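The abstract's central idea is that channels coupled by structural patterns (for example, the convolutions feeding a residual add) must be scored and pruned together as a group. The sketch below is a minimal illustration of that group-level importance idea, not the authors' implementation: the layer names, the L2-norm criterion, and the 50% keep ratio are assumptions chosen for clarity.

```python
# Minimal sketch (not the SPA implementation) of group-level channel importance:
# operators whose output channels are coupled (e.g. two convs summed by a
# residual connection) are scored together, so a coupled channel is kept or
# pruned as one unit across the whole group.
import torch
import torch.nn as nn

def group_channel_importance(convs):
    """Aggregate per-output-channel L2 norms across coupled conv layers."""
    scores = torch.zeros(convs[0].out_channels)
    for conv in convs:
        # weight shape: (out_channels, in_channels, kH, kW)
        scores += conv.weight.detach().flatten(1).norm(p=2, dim=1)
    return scores

def channels_to_keep(convs, keep_ratio=0.5):
    """Return indices of the most important coupled channels (illustrative ratio)."""
    scores = group_channel_importance(convs)
    k = max(1, int(keep_ratio * scores.numel()))
    return torch.topk(scores, k).indices.sort().values

if __name__ == "__main__":
    # Two convs whose outputs are added by a residual connection share their
    # output-channel dimension, so they form one coupled group.
    conv_a = nn.Conv2d(16, 32, 3, padding=1)
    conv_b = nn.Conv2d(16, 32, 1)
    keep = channels_to_keep([conv_a, conv_b], keep_ratio=0.5)
    print("channels kept:", keep.tolist())
```

In the full method described in the abstract, such groups are discovered automatically from a standardized computational graph (via ONNX) rather than listed by hand, and the importance criterion is pluggable rather than fixed to an L2 norm as assumed here.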

Authors (4)
  1. Xun Wang (96 papers)
  2. John Rachwan (4 papers)
  3. Stephan Günnemann (169 papers)
  4. Bertrand Charpentier (21 papers)
Citations (3)