Early-Exit with Class Exclusion for Efficient Inference of Neural Networks (2309.13443v2)
Abstract: Deep neural networks (DNNs) have been applied successfully in various fields. However, DNN inference requires a large number of multiply-accumulate (MAC) operations, which makes deployment on resource-constrained platforms, e.g., edge devices, challenging. To address this challenge, we propose a class-based early exit for dynamic inference. Instead of forcing a DNN to make a final decision at an intermediate layer, we exploit the features learned at these layers to exclude as many irrelevant classes as possible, so that later layers only have to determine the target class among the remaining candidates. When only one class remains at a layer, that class is returned as the classification result and the rest of the network is skipped. Experimental results demonstrate that the proposed early-exit technique significantly reduces the computational cost of DNN inference. The code can be found at https://github.com/HWAI-TUDa/EarlyClassExclusion.
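The control flow described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the stage callables, the `keep_ratio` threshold, and the exclusion rule (drop classes whose score falls below a fraction of the stage maximum) are all assumptions made for the sketch.

```python
def class_exclusion_inference(stages, x, active, keep_ratio=0.5):
    """Run intermediate stages in order, shrinking the candidate class set.

    stages:     callables mapping (input, active_classes) -> {class: score}
    active:     initial set of candidate class ids
    keep_ratio: hypothetical exclusion rule -- a class survives a stage
                only if its score >= keep_ratio * (best score at that stage)
    """
    scores = {}
    for stage in stages:
        scores = stage(x, active)
        best = max(scores.values())
        # Exclude classes the current stage considers irrelevant.
        active = {c for c in active if scores[c] >= keep_ratio * best}
        if len(active) == 1:
            # Early exit: only one candidate remains, so it is the result
            # and the remaining stages are skipped entirely.
            return next(iter(active))
    # No early exit triggered: the final stage decides among the survivors.
    return max(active, key=lambda c: scores[c])
```

The savings come from two sources: stages after an early exit are never executed, and (as in the paper's class-based setting) each stage only needs to score the classes still in `active` rather than the full label set.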