Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
97 tokens/sec
GPT-4o
53 tokens/sec
Gemini 2.5 Pro Pro
44 tokens/sec
o3 Pro
5 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Deep Learning Inference on Heterogeneous Mobile Processors: Potentials and Pitfalls (2405.01851v1)

Published 3 May 2024 in cs.LG and cs.AI

Abstract: There is a growing demand to deploy computation-intensive deep learning (DL) models on resource-constrained mobile devices for real-time intelligent applications. Equipped with a variety of processing units such as CPUs, GPUs, and NPUs, the mobile devices hold potential to accelerate DL inference via parallel execution across heterogeneous processors. Various efficient parallel methods have been explored to optimize computation distribution, achieve load balance, and minimize communication cost across processors. Yet their practical effectiveness in the dynamic and diverse real-world mobile environment is less explored. This paper presents a holistic empirical study to assess the capabilities and challenges associated with parallel DL inference on heterogeneous mobile processors. Through carefully designed experiments covering various DL models, mobile software/hardware environments, workload patterns, and resource availability, we identify limitations of existing techniques and highlight opportunities for cross-level optimization.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (8)
  1. Sicong Liu (38 papers)
  2. Wentao Zhou (7 papers)
  3. Zimu Zhou (19 papers)
  4. Bin Guo (150 papers)
  5. Minfan Wang (1 paper)
  6. Cheng Fang (33 papers)
  7. Zheng Lin (104 papers)
  8. Zhiwen Yu (78 papers)

Summary

We haven't generated a summary for this paper yet.

X Twitter Logo Streamline Icon: https://streamlinehq.com

Tweets