
An automated approach for improving the inference latency and energy efficiency of pretrained CNNs by removing irrelevant pixels with focused convolutions (2310.07782v1)

Published 11 Oct 2023 in cs.CV

Abstract: Computer vision often relies on highly accurate Convolutional Neural Networks (CNNs), but these deep learning models carry ever-increasing energy and computation requirements. Producing more energy-efficient CNNs often requires model training, which can be cost-prohibitive. We propose a novel, automated method to make a pretrained CNN more energy-efficient without retraining. Given a pretrained CNN, we insert a threshold layer that filters activations from the preceding layers to identify regions of the image that are irrelevant, i.e., regions that can be ignored by the following layers while maintaining accuracy. Our modified focused convolution operation reduces inference latency (by up to 25%) and energy costs (by up to 22%) on various popular pretrained CNNs, with little to no loss in accuracy.
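The core idea in the abstract can be sketched in code. Below is a minimal, illustrative NumPy version, not the authors' implementation: a hypothetical `relevance_mask` plays the role of the inserted threshold layer (here thresholding the channel-wise mean activation, an assumption on our part), and a hypothetical `focused_conv2d` computes the convolution only at pixels the mask marks as relevant, which is where the latency and energy savings would come from.

```python
import numpy as np

def relevance_mask(activations, threshold):
    # activations: (C, H, W) feature map from the preceding layer.
    # Assumed criterion: a pixel is "relevant" if its channel-wise
    # mean activation exceeds the threshold; the rest are skipped.
    return activations.mean(axis=0) > threshold

def focused_conv2d(x, weight, mask):
    # x: (C_in, H, W), weight: (C_out, C_in, k, k), mask: (H, W) bool.
    # Naive "focused" convolution (same padding, stride 1): output is
    # computed only where mask is True; irrelevant positions stay zero.
    c_out, c_in, k, _ = weight.shape
    _, h, w = x.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((c_out, h, w), dtype=x.dtype)
    for i, j in zip(*np.nonzero(mask)):  # iterate relevant pixels only
        patch = xp[:, i:i + k, j:j + k]
        # Contract weight's (C_in, k, k) against the patch for all C_out.
        out[:, i, j] = np.tensordot(weight, patch, axes=3)
    return out
```

In a real deployment the savings depend on the fraction of pixels masked out and on the kernel implementation; this dense per-pixel loop only demonstrates the control flow, not an optimized GEMM-based variant.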

