Resource-aware Deployment of Dynamic DNNs over Multi-tiered Interconnected Systems (2404.08060v1)

Published 11 Apr 2024 in cs.NI and eess.SP

Abstract: The increasing pervasiveness of intelligent mobile applications requires exploiting the full range of resources offered by the mobile-edge-cloud network for the execution of inference tasks. However, due to the heterogeneity of such multi-tiered networks, it is essential to make the applications' demand amenable to the available resources while minimizing energy consumption. Modern dynamic deep neural networks (DNNs) achieve this goal through multi-branched architectures in which early exits enable sample-based adaptation of the model depth. In this paper, we tackle the problem of allocating sections of DNNs with early exits to the nodes of the mobile-edge-cloud system. By envisioning a 3-stage graph-modeling approach, we represent the possible options for splitting the DNN and deploying the DNN blocks on the multi-tiered network, embedding both the system constraints and the application requirements in a convenient and efficient way. Our framework -- named Feasible Inference Graph (FIN) -- can identify the solution that minimizes the overall inference energy consumption while enabling distributed inference over the multi-tiered network with the target quality and latency. Our results, obtained for DNNs with different levels of complexity, show that FIN matches the optimum and yields over 65% energy savings relative to a state-of-the-art technique for cost minimization.
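The allocation problem the abstract describes can be illustrated with a toy enumeration. This is not the paper's FIN graph construction, only a minimal sketch of the underlying idea: choose where to split a sequential DNN across mobile, edge, and cloud tiers so that total energy is minimized under an end-to-end latency budget. All cost figures below are invented for illustration.

```python
# Hypothetical per-block compute cost on each tier: (energy mJ, latency ms).
COMPUTE = {
    "mobile": (5.0, 40.0),
    "edge":   (2.0, 10.0),
    "cloud":  (1.0, 4.0),
}
# Hypothetical cost of shipping an intermediate tensor between tiers.
TRANSFER = {
    ("mobile", "edge"):  (3.0, 15.0),
    ("edge",   "cloud"): (4.0, 25.0),
    ("mobile", "cloud"): (7.0, 40.0),
}

def plan(num_blocks, latency_budget_ms):
    """Enumerate non-decreasing tier assignments (i.e. the two split
    points) and return the minimum-energy feasible plan, or None."""
    best = None
    # s1 = index of the first block on the edge, s2 = first block on the
    # cloud, with s1 <= s2; blocks before s1 run on the mobile device.
    for s1 in range(num_blocks + 1):
        for s2 in range(s1, num_blocks + 1):
            assign = (["mobile"] * s1 + ["edge"] * (s2 - s1)
                      + ["cloud"] * (num_blocks - s2))
            energy = latency = 0.0
            # The input sample originates on the mobile device, so a
            # transfer is paid whenever the executing tier changes.
            chain = ["mobile"] + assign
            for prev, tier in zip(chain, assign):
                if prev != tier:
                    te, tl = TRANSFER[(prev, tier)]
                    energy += te
                    latency += tl
                e, l = COMPUTE[tier]
                energy += e
                latency += l
            if latency <= latency_budget_ms and (best is None
                                                 or energy < best[0]):
                best = (energy, latency, tuple(assign))
    return best
```

With a loose latency budget the sketch offloads everything to the cloud; shrinking the budget below the minimum achievable latency makes the problem infeasible and `plan` returns `None`. FIN embeds the same trade-off, together with early-exit quality constraints, directly into a layered graph so that the optimal deployment emerges as a minimum-cost feasible path rather than from brute-force enumeration.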

Authors (5)
  1. Chetna Singhal (8 papers)
  2. Yashuo Wu (2 papers)
  3. Francesco Malandrino (57 papers)
  4. Marco Levorato (50 papers)
  5. Carla Fabiana Chiasserini (61 papers)
Citations (2)