Iterative Filter Pruning for Concatenation-based CNN Architectures (2405.03715v1)
Abstract: Model compression and hardware acceleration are essential for the resource-efficient deployment of deep neural networks. Modern object detectors feature highly interconnected convolutional layers with concatenations. In this work, we study how pruning can be applied to such architectures, using YOLOv7 as an example. We propose a method to handle concatenation layers based on the connectivity graph of convolutional layers. By automating iterative sensitivity analysis, pruning, and subsequent model fine-tuning, we can significantly reduce model size in terms of both the number of parameters and FLOPs, while maintaining comparable accuracy. Finally, we deploy the pruned models to an FPGA and an NVIDIA Jetson Xavier AGX. The pruned models achieve a 2x speedup for the convolutional layers compared to their unpruned counterparts and reach real-time capability at 14 FPS on the FPGA. Our code is available at https://github.com/fzi-forschungszentrum-informatik/iterative-yolo-pruning.
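The difficulty the abstract alludes to is that structured filter pruning must stay consistent across layer boundaries: removing an output filter from one convolution also removes an input channel from every consumer, and a concatenation shifts those channel positions by the width of each preceding branch. Below is a minimal PyTorch sketch of that offset bookkeeping, assuming L1-norm filter ranking as in Li et al.; the helper names (`l1_filter_ranking`, `prune_conv_filters`, `prune_concat_consumer`) are illustrative assumptions, not the authors' API from the linked repository.

```python
import torch
import torch.nn as nn

def l1_filter_ranking(conv: nn.Conv2d, keep_ratio: float) -> torch.Tensor:
    """Indices of the filters to keep, ranked by the L1 norm of their weights."""
    scores = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    n_keep = max(1, int(keep_ratio * conv.out_channels))
    return torch.sort(torch.topk(scores, n_keep).indices).values

def prune_conv_filters(conv: nn.Conv2d, keep: torch.Tensor) -> nn.Conv2d:
    """Copy of `conv` that keeps only the output filters listed in `keep`."""
    pruned = nn.Conv2d(conv.in_channels, len(keep), conv.kernel_size,
                       conv.stride, conv.padding, bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep].clone()
    return pruned

def prune_concat_consumer(consumer, keep_per_branch, branch_widths):
    """Prune the *input* channels of a conv fed by torch.cat(branches, dim=1).
    Each branch's kept indices are shifted by the channel offset of that
    branch inside the concatenated tensor."""
    cols, offset = [], 0
    for keep, width in zip(keep_per_branch, branch_widths):
        cols.append(keep + offset)
        offset += width
    cols = torch.cat(cols)
    pruned = nn.Conv2d(len(cols), consumer.out_channels, consumer.kernel_size,
                       consumer.stride, consumer.padding,
                       bias=consumer.bias is not None)
    pruned.weight.data = consumer.weight.data[:, cols].clone()
    if consumer.bias is not None:
        pruned.bias.data = consumer.bias.data.clone()
    return pruned

# Toy example: two branches of 64 channels each feed a concat, then a 1x1 conv.
branch_a = nn.Conv2d(3, 64, 3, padding=1)
branch_b = nn.Conv2d(3, 64, 3, padding=1)
head = nn.Conv2d(128, 32, 1)

keep_a = l1_filter_ranking(branch_a, keep_ratio=0.5)
keep_b = l1_filter_ranking(branch_b, keep_ratio=0.5)
branch_a = prune_conv_filters(branch_a, keep_a)
branch_b = prune_conv_filters(branch_b, keep_b)
head = prune_concat_consumer(head, [keep_a, keep_b], [64, 64])

x = torch.randn(1, 3, 32, 32)
y = head(torch.cat([branch_a(x), branch_b(x)], dim=1))  # 32 + 32 = 64 input channels
```

This sketch only covers a single concatenation of two branches; the paper's connectivity-graph approach is what generalizes such bookkeeping to the deeply nested concatenations found in YOLOv7, and automates it across iterative sensitivity analysis, pruning, and fine-tuning rounds.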