LAPP: Layer Adaptive Progressive Pruning for Compressing CNNs from Scratch (2309.14157v1)
Abstract: Structured pruning is a widely used approach to compressing convolutional neural networks (CNNs), and setting the pruning rate is a fundamental problem within it. Most existing works either introduce too many additional learnable parameters to assign different pruning rates across layers, or cannot control the compression rate explicitly. Moreover, because an overly narrow network blocks information flow during training, automatic pruning-rate setting cannot explore a high pruning rate for a specific layer. To overcome these limitations, we propose a novel framework named Layer Adaptive Progressive Pruning (LAPP), which gradually compresses the network during the first few epochs of training from scratch. In particular, LAPP designs an effective and efficient pruning strategy that introduces a learnable threshold for each layer and a FLOPs constraint for the network. Guided by both the task loss and the FLOPs constraint, the learnable thresholds are dynamically and gradually updated to accommodate changes in importance scores during training. The pruning strategy can therefore prune the network gradually and automatically determine an appropriate pruning rate for each layer. Furthermore, to maintain the expressive power of each pruned layer, we introduce a lightweight bypass for each convolutional layer to be pruned before training starts, which adds only a small overhead. Our method demonstrates superior performance over previous compression methods on various datasets and backbone architectures. For example, on CIFAR-10 our method compresses ResNet-20 to 40.3% without accuracy drop, and on ImageNet it reduces the FLOPs of ResNet-18 by 55.6% with a 0.21% top-1 accuracy increase and a 0.40% top-5 accuracy increase.
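The core mechanism the abstract describes, a per-layer learnable threshold gating channel importance scores under a global FLOPs budget, can be illustrated with a minimal sketch. This is not the paper's implementation: the sigmoid soft gate, the per-channel FLOPs accounting, the temperature, and all function names here are hypothetical simplifications introduced for illustration only.

```python
import math

def soft_keep_mask(scores, threshold, temp=0.05):
    # Soft (differentiable) keep-probabilities: a channel whose importance
    # score exceeds the layer's learnable threshold is (softly) kept.
    # The sigmoid gate and temperature are illustrative assumptions.
    return [1.0 / (1.0 + math.exp(-(s - threshold) / temp)) for s in scores]

def flops_penalty(layer_masks, per_channel_flops, budget):
    # per_channel_flops[l]: assumed FLOPs contributed by one channel of
    # layer l. Penalize the (soft) kept-FLOPs ratio exceeding the budget.
    kept = sum(f * sum(m) for m, f in zip(layer_masks, per_channel_flops))
    total = sum(f * len(m) for m, f in zip(layer_masks, per_channel_flops))
    return max(0.0, kept / total - budget)

# Toy example: two layers with per-channel importance scores.
scores = [[0.9, 0.1, 0.8, 0.05], [0.7, 0.6, 0.02]]
thresholds = [0.5, 0.3]  # learnable in LAPP; fixed constants in this sketch
masks = [soft_keep_mask(s, t) for s, t in zip(scores, thresholds)]

# Adding this penalty to the task loss would push thresholds up (pruning
# more channels) whenever the network is over the FLOPs budget.
penalty = flops_penalty(masks, per_channel_flops=[1.0, 2.0], budget=0.5)
```

In the actual method, gradients from both the task loss and this kind of FLOPs term update the thresholds jointly, so layers whose channels matter less for the task end up with higher pruning rates.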
Authors: Pucheng Zhai, Kailing Guo, Fang Liu, Xiaofen Xing, Xiangmin Xu