Towards Real-Time DNN Inference on Mobile Platforms with Model Pruning and Compiler Optimization (2004.11250v1)
Published 22 Apr 2020 in cs.LG, cs.CV, and cs.MM
Abstract: High-end mobile platforms are rapidly becoming primary computing devices for a wide range of Deep Neural Network (DNN) applications. However, the constrained computation and storage resources on these devices still pose significant challenges for real-time DNN inference. To address this problem, we propose a set of hardware-friendly structured model pruning and compiler optimization techniques to accelerate DNN execution on mobile devices. This demo shows that these optimizations can enable real-time mobile execution of multiple DNN applications, including style transfer, DNN coloring, and super resolution.
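The abstract's "hardware-friendly structured model pruning" refers to removing entire structural units (e.g., whole convolution filters) rather than individual weights, so the pruned model remains a dense tensor that mobile hardware and compilers can execute efficiently. The paper does not specify its pruning criterion here; the sketch below is a minimal illustration using a common heuristic (ranking filters by L2 norm), not the authors' actual method:

```python
import numpy as np

def prune_filters(weights, prune_ratio):
    """Structured (filter-level) pruning sketch: drop the output filters
    with the smallest L2 norms. The surviving tensor stays dense, which
    is what makes this style of pruning hardware-friendly.

    weights: conv weight array of shape (out_channels, in_channels, kh, kw)
    prune_ratio: fraction of output filters to remove, in [0, 1)
    """
    # L2 norm of each output filter, flattened across its other dims
    norms = np.linalg.norm(weights.reshape(weights.shape[0], -1), axis=1)
    n_keep = weights.shape[0] - int(round(prune_ratio * weights.shape[0]))
    # Keep the n_keep strongest filters, preserving their original order
    keep = np.sort(np.argsort(norms)[::-1][:n_keep])
    return weights[keep], keep

# Illustrative example: 8 random 3x3 filters, prune half of them
w = np.random.randn(8, 3, 3, 3)
pruned, kept = prune_filters(w, 0.5)
print(pruned.shape)  # (4, 3, 3, 3)
```

Because whole filters are removed, the corresponding input channels of the next layer can be deleted as well, shrinking both compute and memory without requiring sparse-matrix support at inference time.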
- Wei Niu
- Pu Zhao
- Zheng Zhan
- Xue Lin
- Yanzhi Wang
- Bin Ren