NACHOS: Neural Architecture Search for Hardware Constrained Early Exit Neural Networks
Abstract: Early Exit Neural Networks (EENNs) endow a standard Deep Neural Network (DNN) with Early Exit Classifiers (EECs) that provide predictions at intermediate points of the processing whenever enough confidence in the classification is reached, yielding benefits in both effectiveness and efficiency. Currently, EENNs are designed manually by experts, a complex and time-consuming task that requires accounting for many aspects, including the correct placement, the thresholding, and the computational overhead of the EECs. For this reason, research is exploring the use of Neural Architecture Search (NAS) to automate the design of EENNs. To date, few comprehensive NAS solutions for EENNs have been proposed in the literature, and a fully automated, joint design strategy that considers both the backbone and the EECs remains an open problem. To this end, this work presents Neural Architecture Search for Hardware Constrained Early Exit Neural Networks (NACHOS), the first NAS framework for the design of optimal EENNs satisfying constraints on the accuracy and on the number of Multiply and Accumulate (MAC) operations performed by the EENNs at inference time. In particular, NACHOS jointly designs the backbone and the EECs and selects a set of admissible (i.e., constraint-satisfying) Pareto-optimal solutions in terms of the best trade-off between accuracy and number of MACs. The results show that the models designed by NACHOS are competitive with state-of-the-art EENNs. Additionally, this work investigates the effectiveness of two novel regularization terms designed for the optimization of the auxiliary classifiers of the EENN.
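The abstract describes the early-exit inference mechanism: a sample leaves the network at the first exit whose classification confidence exceeds a threshold, skipping the remaining (and most expensive) computation. The sketch below illustrates that mechanism only; it is not the NACHOS implementation, and the backbone, exit placement, number of classes, and threshold value are hypothetical choices made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyEENN(nn.Module):
    """Minimal early-exit network: a small convolutional backbone with one
    Early Exit Classifier (EEC) attached at an intermediate point and a
    final classifier at the end. Architecture is hypothetical."""

    def __init__(self, num_classes: int = 10, threshold: float = 0.8):
        super().__init__()
        self.threshold = threshold  # confidence required to stop early
        self.stage1 = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)
        )
        self.exit1 = nn.Sequential(  # auxiliary (early) classifier
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, num_classes)
        )
        self.stage2 = nn.Sequential(
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)
        )
        self.exit_final = nn.Sequential(  # final classifier
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes)
        )

    @torch.no_grad()
    def infer(self, x: torch.Tensor):
        """Per-sample inference: returns (predicted class, exit index).
        The sample exits at the first classifier whose maximum softmax
        probability exceeds the confidence threshold."""
        h = self.stage1(x)
        probs = F.softmax(self.exit1(h), dim=1)
        conf, pred = probs.max(dim=1)
        if conf.item() >= self.threshold:
            return pred.item(), 0  # early exit: stage2 MACs are never spent
        h = self.stage2(h)
        probs = F.softmax(self.exit_final(h), dim=1)
        return probs.argmax(dim=1).item(), 1


model = TinyEENN()
x = torch.randn(1, 3, 32, 32)  # a single CIFAR-sized input
pred, exit_idx = model.infer(x)
print(f"predicted class {pred} at exit {exit_idx}")
```

In a constrained NAS setting such as the one the abstract outlines, the placement and size of the auxiliary classifier and the backbone stages would be search variables, and candidates would be kept only if their accuracy and expected MAC count satisfy the user-supplied constraints.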