Depth-wise Decomposition for Accelerating Separable Convolutions in Efficient Convolutional Neural Networks (1910.09455v3)

Published 21 Oct 2019 in cs.CV

Abstract: Very deep convolutional neural networks (CNNs) have been firmly established as the primary methods for many computer vision tasks. However, most state-of-the-art CNNs are large, which results in high inference latency. Recently, depth-wise separable convolution has been proposed for image recognition tasks on computationally limited platforms such as robotics and self-driving cars. Though it is much faster than its counterpart, regular convolution, it sacrifices accuracy. In this paper, we propose a novel decomposition approach based on SVD, namely depth-wise decomposition, for expanding regular convolutions into depthwise separable convolutions while maintaining high accuracy. We show that our approach can be further generalized to the multi-channel and multi-layer cases, based on the Generalized Singular Value Decomposition (GSVD) [59]. We conduct thorough experiments with the latest ShuffleNet V2 model [47] on both a randomly synthesized dataset and a large-scale image recognition dataset, ImageNet [10]. Our approach outperforms channel decomposition [73] on all datasets. More importantly, our approach improves the Top-1 accuracy of ShuffleNet V2 by ~2%.
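
The single-layer core of the idea the abstract describes (replacing a regular convolution with a depthwise filter followed by a pointwise projection) can be sketched as a rank-1 SVD applied per input channel. The NumPy sketch below is an illustration under that assumption, not the paper's full GSVD-based multi-channel/multi-layer algorithm; the function and variable names here are hypothetical.

```python
import numpy as np

def depthwise_decompose(W):
    """Approximate a regular conv kernel W (n_out, c_in, k, k) with a
    depthwise kernel D (c_in, k, k) and a pointwise kernel P (n_out, c_in)
    via a rank-1 SVD per input channel (hypothetical helper; the paper
    generalizes this idea with GSVD)."""
    n, c, k, _ = W.shape
    D = np.empty((c, k, k))
    P = np.empty((n, c))
    for i in range(c):
        # Each input channel's (n_out, k*k) slice is replaced by its best
        # rank-1 fit: an outer product of pointwise weights and one shared
        # k x k spatial filter (optimal in Frobenius norm by Eckart-Young).
        U, S, Vt = np.linalg.svd(W[:, i].reshape(n, k * k), full_matrices=False)
        s = np.sqrt(S[0])  # split the top singular value between the factors
        P[:, i] = U[:, 0] * s
        D[i] = (Vt[0] * s).reshape(k, k)
    return D, P

# Quick check: the composed separable kernel P[o, i] * D[i] is the
# rank-1 approximation of each W[:, i] slice.
W = np.random.randn(64, 32, 3, 3)
D, P = depthwise_decompose(W)
W_hat = P[:, :, None, None] * D[None, :, :, :]
print(np.linalg.norm(W - W_hat) / np.linalg.norm(W))  # relative error
```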

References (77)
  1. Tensorflow: a system for large-scale machine learning. In OSDI, volume 16, pages 265–283, 2016.
  2. Learning the number of neurons in deep networks. In Advances in Neural Information Processing Systems, pages 2262–2270, 2016.
  3. Structured pruning of deep convolutional neural networks. arXiv preprint arXiv:1512.08571, 2015.
  4. Compact deep convolutional neural networks with coarse pruning. arXiv preprint arXiv:1610.09639, 2016.
  5. Lcnn: Lookup-based convolutional neural network. arXiv preprint arXiv:1611.06473, 2016.
  6. A survey of model compression and acceleration for deep neural networks. arXiv preprint arXiv:1710.09282, 2017.
  7. cudnn: Efficient primitives for deep learning. arXiv preprint arXiv:1410.0759, 2014.
  8. François Chollet. Xception: Deep learning with depthwise separable convolutions. arXiv preprint arXiv:1610.02357, 2017.
  9. Binarynet: Training deep neural networks with weights and activations constrained to +1 or -1. arXiv preprint arXiv:1602.02830, 2016.
  10. Imagenet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 248–255. IEEE, 2009.
  11. Exploiting linear structure within convolutional networks for efficient evaluation. In Advances in Neural Information Processing Systems, pages 1269–1277, 2014.
  12. Motion prediction of traffic actors for autonomous driving using deep convolutional networks. arXiv preprint arXiv:1808.05819, 2018.
  13. Ross Girshick. Fast r-cnn. In Proceedings of the IEEE International Conference on Computer Vision, pages 1440–1448, 2015.
  14. Compressing deep convolutional networks using vector quantization. arXiv preprint arXiv:1412.6115, 2014.
  15. Network decoupling: From regular to depthwise separable convolutions. arXiv preprint arXiv:1808.05517, 2018.
  16. Dynamic network surgery for efficient dnns. In Advances In Neural Information Processing Systems, pages 1379–1387, 2016.
  17. Eie: efficient inference engine on compressed deep neural network. In Proceedings of the 43rd International Symposium on Computer Architecture, pages 243–254. IEEE Press, 2016.
  18. Deep compression: Compressing deep neural network with pruning, trained quantization and huffman coding. CoRR, abs/1510.00149, 2015.
  19. Learning both weights and connections for efficient neural network. In Advances in Neural Information Processing Systems, pages 1135–1143, 2015.
  20. Hypercolumns for object segmentation and fine-grained localization. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 447–456, 2015.
  21. Second order derivatives for network pruning: Optimal brain surgeon. Morgan Kaufmann, 1993.
  22. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
  23. Adc: Automated deep compression and acceleration with reinforcement learning. arXiv preprint arXiv:1802.03494, 2018.
  24. Amc: Automl for model compression and acceleration on mobile devices. In Proceedings of the European Conference on Computer Vision (ECCV), pages 784–800, 2018.
  25. Addressnet: Shift-based primitives for efficient convolutional neural networks. In 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), pages 1213–1222. IEEE, 2019.
  26. Softer-nms: Rethinking bounding box regression for accurate object detection. arXiv preprint arXiv:1809.08545, 2018.
  27. Channel pruning for accelerating very deep neural networks. In International Conference on Computer Vision (ICCV), volume 2, 2017.
  28. Bounding box regression with uncertainty for accurate object detection. In 2019 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2019.
  29. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861, 2017.
  30. Network trimming: A data-driven neuron pruning approach towards efficient deep architectures. arXiv preprint arXiv:1607.03250, 2016.
  31. Speeding up convolutional neural networks with low rank expansions. arXiv preprint arXiv:1405.3866, 2014.
  32. Compression of deep convolutional neural networks for fast and low power mobile applications. arXiv preprint arXiv:1511.06530, 2015.
  33. Lit: Block-wise intermediate representation training for model compression. arXiv preprint arXiv:1810.01937, 2018.
  34. Visual genome: Connecting language and vision using crowdsourced dense image annotations. International Journal of Computer Vision, 123(1):32–73, 2017.
  35. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pages 1097–1105, 2012.
  36. Pack and detect: Fast object detection in videos using region-of-interest packing. In Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, pages 150–156. ACM, 2019.
  37. Andrew Lavin. Fast algorithms for convolutional neural networks. arXiv preprint arXiv:1509.09308, 2015.
  38. Speeding-up convolutional neural networks using fine-tuned cp-decomposition. arXiv preprint arXiv:1412.6553, 2014.
  39. Fast convnets using group-wise brain damage. arXiv preprint arXiv:1506.02515, 2015.
  40. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
  41. Optimal brain damage. In NIPS, volume 2, pages 598–605, 1989.
  42. Pruning filters for efficient convnets. arXiv preprint arXiv:1608.08710, 2016.
  43. Single image super-resolution via a lightweight residual convolutional neural network. arXiv preprint arXiv:1703.08173, 2017.
  44. Microsoft coco: Common objects in context. In European conference on computer vision, pages 740–755. Springer, Cham, 2014.
  45. Sparse convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 806–814, 2015.
  46. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3431–3440, 2015.
  47. Shufflenet v2: Practical guidelines for efficient cnn architecture design. In The European Conference on Computer Vision (ECCV), September 2018.
  48. Camera placement based on vehicle traffic for better city security surveillance. IEEE Intelligent Systems, 33(4):49–61, 2018.
  49. Diversity networks. arXiv preprint arXiv:1511.05077, 2015.
  50. Fast training of convolutional networks through ffts. arXiv preprint arXiv:1312.5851, 2013.
  51. Scalable parallel programming with cuda. In ACM SIGGRAPH 2008 classes, page 16. ACM, 2008.
  52. Mobinet: A mobile binary network for image classification. In 2020 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2020.
  53. Channel-level acceleration of deep face representations. IEEE Access, 3:2163–2175, 2015.
  54. Xnor-net: Imagenet classification using binary convolutional neural networks. In European Conference on Computer Vision, pages 525–542. Springer, 2016.
  55. Faster r-cnn: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems, pages 91–99, 2015.
  56. Rigid-motion scattering for image classification. PhD thesis, École Polytechnique, 2014.
  57. Data-free parameter pruning for deep neural networks. arXiv preprint arXiv:1507.06149, 2015.
  58. Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1–9, 2015.
  59. Regularized linear and kernel redundancy analysis. Computational Statistics & Data Analysis, 52(1):394–405, 2007.
  60. Mnasnet: Platform-aware neural architecture search for mobile. arXiv preprint arXiv:1807.11626, 2018.
  61. Fast convolutional nets with fbfft: A gpu performance evaluation. arXiv preprint arXiv:1412.7580, 2014.
  62. Prediction-tracking-segmentation. arXiv preprint arXiv:1904.03280, 2019.
  63. Vertical jump height estimation algorithm based on takeoff and landing identification via foot-worn inertial sensing. Journal of biomechanical engineering, 140(3):034502, 2018.
  64. Haq: Hardware-aware automated quantization. arXiv preprint arXiv:1811.08886, 2018.
  65. Learning structured sparsity in deep neural networks. In Advances In Neural Information Processing Systems, pages 2074–2082, 2016.
  66. Shift: A zero flop, zero parameter alternative to spatial convolutions. arXiv preprint arXiv:1711.08141, 2017.
  67. Yuxin Wu et al. Tensorpack. github.com/tensorpack/, 2016.
  68. Validation of a smart shoe for estimating foot progression angle during walking gait. Journal of biomechanics, 61:193–198, 2017.
  69. Restructuring of deep neural network acoustic models with singular value decomposition. In INTERSPEECH, pages 2365–2369, 2013.
  70. Designing energy-efficient convolutional neural networks using energy-aware pruning. arXiv preprint arXiv:1611.05128, 2016.
  71. Visualizing and understanding convolutional networks. In European conference on computer vision, pages 818–833. Springer, 2014.
  72. Shufflenet: An extremely efficient convolutional neural network for mobile devices. arXiv preprint arXiv:1707.01083, 2017.
  73. Accelerating very deep convolutional networks for classification and detection. IEEE transactions on pattern analysis and machine intelligence, 38(10):1943–1955, 2016.
  74. Scene parsing through ade20k dataset. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.
  75. Less is more: Towards compact cnns. In European Conference on Computer Vision, pages 662–677. Springer International Publishing, 2016.
  76. Feature selective anchor-free module for single-shot object detection. arXiv preprint arXiv:1903.00621, 2019.
  77. Learning transferable architectures for scalable image recognition. arXiv preprint arXiv:1707.07012, 2017.