Towards Real-Time DNN Inference on Mobile Platforms with Model Pruning and Compiler Optimization (2004.11250v1)

Published 22 Apr 2020 in cs.LG, cs.CV, and cs.MM

Abstract: High-end mobile platforms rapidly serve as primary computing devices for a wide range of Deep Neural Network (DNN) applications. However, the constrained computation and storage resources on these devices still pose significant challenges for real-time DNN inference executions. To address this problem, we propose a set of hardware-friendly structured model pruning and compiler optimization techniques to accelerate DNN executions on mobile devices. This demo shows that these optimizations can enable real-time mobile execution of multiple DNN applications, including style transfer, DNN coloring and super resolution.
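The abstract does not spell out the pruning criterion, but "hardware-friendly structured pruning" typically means removing whole filters or channels so the remaining tensor stays dense. As a rough illustration only (magnitude-based filter ranking is a common heuristic, not necessarily the authors' exact method), a structured pruning pass over a convolutional layer might look like:

```python
import numpy as np

def prune_filters(weights: np.ndarray, keep_ratio: float):
    """Keep the top fraction of conv filters ranked by L2 norm.

    weights: array of shape (out_channels, in_channels, kH, kW).
    Returns the pruned weight tensor and the kept filter indices,
    so downstream layers can be sliced to match.
    """
    # One L2 norm per output filter.
    norms = np.linalg.norm(weights.reshape(weights.shape[0], -1), axis=1)
    n_keep = max(1, int(round(weights.shape[0] * keep_ratio)))
    # Indices of the largest-norm filters, in original order.
    keep = np.sort(np.argsort(norms)[::-1][:n_keep])
    return weights[keep], keep

# Toy example: a conv layer with 8 filters, prune half of them.
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 3, 3, 3))
pruned, kept = prune_filters(w, 0.5)
print(pruned.shape)  # (4, 3, 3, 3)
```

Because entire filters are removed, the pruned layer remains a regular dense convolution, which is what makes this style of pruning amenable to the compiler-level optimizations the paper pairs it with.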

Authors (6)
  1. Wei Niu (68 papers)
  2. Pu Zhao (82 papers)
  3. Zheng Zhan (27 papers)
  4. Xue Lin (92 papers)
  5. Yanzhi Wang (197 papers)
  6. Bin Ren (136 papers)
Citations (5)