Slimmable Encoders for Flexible Split DNNs in Bandwidth and Resource Constrained IoT Systems (2306.12691v1)

Published 22 Jun 2023 in cs.LG

Abstract: Executing large deep neural networks (DNNs) on mobile edge devices consumes considerable amounts of critical resources, such as energy, while imposing demands on hardware capabilities. In edge-computing approaches, model execution is offloaded to a compute-capable device positioned at the edge of 5G infrastructures. The main issue with the latter class of approaches is the need to transport information-rich signals over wireless links with limited and time-varying capacity. The recent split computing paradigm attempts to resolve this impasse by distributing the execution of DNN models across the layers of the system, reducing the amount of data to be transmitted while imposing minimal computing load on mobile devices. In this context, we propose a novel split computing approach based on slimmable ensemble encoders. The key advantage of our design is its ability to adapt the computational load and transmitted data size in real time with minimal overhead and latency, in contrast with existing approaches, where the same adaptation requires costly context switching and model loading. Moreover, our model outperforms existing solutions in compression efficacy and execution time, especially on weak mobile devices. We present a comprehensive comparison with the most advanced split computing solutions, as well as an experimental evaluation on GPU-less devices.
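The core mechanism behind slimmable encoders, adapting compute and transmitted feature size by switching the active width of a single set of weights, can be illustrated with a minimal sketch. This is not the paper's architecture; the class name, shapes, and width multipliers below are illustrative assumptions:

```python
import numpy as np

class SlimmableEncoder:
    """Toy sketch of a slimmable encoder: one weight matrix whose
    active width (number of output features) is switched at inference
    time, trading fidelity against the size of the feature vector
    sent over the wireless link. Illustrative only."""

    def __init__(self, in_dim, max_width, widths=(0.25, 0.5, 1.0), seed=0):
        rng = np.random.default_rng(seed)
        self.weight = rng.standard_normal((max_width, in_dim)) * 0.1
        self.widths = widths          # supported width multipliers
        self.max_width = max_width

    def encode(self, x, width_mult):
        assert width_mult in self.widths
        k = int(self.max_width * width_mult)
        # Slicing the first k rows switches width with no model reload
        # or context switch: a smaller matrix-vector product on the
        # device, and a smaller tensor to transmit to the edge server.
        return np.maximum(self.weight[:k] @ x, 0.0)  # ReLU activation

enc = SlimmableEncoder(in_dim=64, max_width=32)
x = np.ones(64)
small = enc.encode(x, 0.25)  # 8-dim feature: least bandwidth/compute
full = enc.encode(x, 1.0)    # 32-dim feature: highest fidelity
```

Because the narrower configurations reuse a prefix of the full weights, the outputs at reduced width coincide with the leading entries of the full-width output, which is what makes real-time switching essentially free compared with loading a separate model per operating point.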

Authors (4)
  1. Juliano S. Assine (3 papers)
  2. J. C. S. Santos Filho (2 papers)
  3. Eduardo Valle (50 papers)
  4. Marco Levorato (50 papers)