Temporal Decisions: Leveraging Temporal Correlation for Efficient Decisions in Early Exit Neural Networks (2403.07958v1)

Published 12 Mar 2024 in cs.LG and cs.AI

Abstract: Deep Learning is becoming increasingly relevant in embedded and Internet-of-Things applications. However, deploying models on embedded devices is challenging due to their resource limitations, which can degrade inference accuracy and latency. One potential solution is Early Exit Neural Networks, which adjust model depth dynamically through additional classifiers attached between their hidden layers. However, the real-time termination decision mechanism is critical for the system's efficiency, latency, and sustained accuracy. This paper introduces Difference Detection and Temporal Patience as decision mechanisms for Early Exit Neural Networks. Both leverage the temporal correlation present in sensor data streams to terminate inference efficiently. We evaluate their effectiveness on health monitoring, image classification, and wake-word detection tasks. Compared to established decision mechanisms, our mechanisms significantly reduce the computational footprint while maintaining higher accuracy scores: mean operations per inference drop by up to 80% while accuracy remains within 5% of the original model. These findings highlight the importance of exploiting temporal correlation in sensor data to improve the termination decision.
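The core idea can be illustrated with a small sketch. Instead of the usual confidence-threshold test at each early exit, the termination decision compares the current exit's prediction against the prediction from the previous sensor sample, exploiting the temporal correlation of the stream. This is a hypothetical, simplified rendering of a Difference Detection-style loop: the function names and the interface are my own, and the paper's actual criterion may use a richer difference measure than simple class agreement.

```python
def argmax(vec):
    """Index of the largest element (pure-Python helper)."""
    return max(range(len(vec)), key=vec.__getitem__)


def difference_detection_inference(exit_logits, prev_prediction):
    """Sketch of a temporal-correlation-based early-exit decision.

    exit_logits: list of logit vectors, one per exit classifier,
                 ordered from shallowest to deepest (the last entry
                 is the full model's output).
    prev_prediction: class predicted for the previous sample in the
                     sensor stream, or None for the first sample.

    Returns (prediction, exits_evaluated).
    """
    for depth, logits in enumerate(exit_logits, start=1):
        pred = argmax(logits)
        # If this exit agrees with the previous sample's prediction,
        # assume the input has not changed meaningfully and terminate,
        # skipping the remaining (deeper, costlier) layers.
        if prev_prediction is not None and pred == prev_prediction:
            return pred, depth
    # No early agreement: fall back to the deepest classifier.
    return argmax(exit_logits[-1]), len(exit_logits)
```

On slowly varying sensor data, consecutive samples usually yield the same class, so most inferences terminate at the first exit, which is where the reported reduction in mean operations per inference comes from.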
