NPAS: A Compiler-aware Framework of Unified Network Pruning and Architecture Search for Beyond Real-Time Mobile Acceleration (2012.00596v3)

Published 1 Dec 2020 in cs.LG, cs.AI, cs.CV, and cs.NE

Abstract: With the increasing demand to efficiently deploy DNNs on mobile edge devices, it becomes much more important to reduce unnecessary computation and increase execution speed. Prior methods towards this goal, including model compression and network architecture search (NAS), are largely performed independently and do not fully consider compiler-level optimizations, which are a must for mobile acceleration. In this work, we first propose (i) a general category of fine-grained structured pruning applicable to various DNN layers, and (ii) a comprehensive compiler automatic code generation framework supporting different DNNs and different pruning schemes, which together bridge the gap between model compression and NAS. We further propose NPAS, a compiler-aware framework for unified network pruning and architecture search. To deal with the large search space, we propose a meta-modeling procedure based on reinforcement learning with fast evaluation and Bayesian optimization, keeping the total number of training epochs comparable with that of representative NAS frameworks. Our framework achieves ImageNet inference times of 6.7ms, 5.9ms, and 3.9ms with 78.2%, 75% (MobileNet-V3 level), and 71% (MobileNet-V2 level) Top-1 accuracy, respectively, on an off-the-shelf mobile phone, consistently outperforming prior work.
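
The abstract does not spell out what "fine-grained structured pruning" looks like in practice. As a rough illustration only, the sketch below shows one possible form of the idea: the rows of a weight matrix are split into blocks, and within each block whole columns with the smallest L2 norm are zeroed, so the sparsity pattern stays regular inside a block while differing between blocks. The function name `block_column_prune`, its parameters `block_rows` and `keep_ratio`, and the norm-based selection rule are all assumptions made for this example; they are not taken from the paper and may differ from the scheme NPAS actually uses.

```python
from typing import List

Matrix = List[List[float]]


def block_column_prune(weights: Matrix, block_rows: int, keep_ratio: float) -> Matrix:
    """Hypothetical sketch of block-wise column pruning (not the paper's code).

    Rows are grouped into blocks of `block_rows`. Inside each block, the
    columns with the largest L2 norm are kept and all others are zeroed.
    """
    if block_rows < 1:
        raise ValueError("block_rows must be at least 1")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")

    n_rows = len(weights)
    n_cols = len(weights[0])
    n_keep = max(1, round(n_cols * keep_ratio))  # columns kept per block

    pruned = [row[:] for row in weights]  # copy, leave the input untouched
    for start in range(0, n_rows, block_rows):
        block = range(start, min(start + block_rows, n_rows))
        # Squared L2 norm of each column, restricted to this block of rows.
        norms = [sum(weights[r][c] ** 2 for r in block) for c in range(n_cols)]
        ranked = sorted(range(n_cols), key=lambda c: norms[c], reverse=True)
        keep = set(ranked[:n_keep])
        for r in block:
            for c in range(n_cols):
                if c not in keep:
                    pruned[r][c] = 0.0
    return pruned
```

Because every row in a block shares the same surviving columns, a code generator could in principle skip the zeroed columns for the whole block at once, which is the kind of regularity that makes a pruning pattern friendly to compiler-level optimization.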

Authors (16)
  1. Zhengang Li (31 papers)
  2. Geng Yuan (58 papers)
  3. Wei Niu (68 papers)
  4. Pu Zhao (82 papers)
  5. Yanyu Li (31 papers)
  6. Yuxuan Cai (25 papers)
  7. Xuan Shen (29 papers)
  8. Zheng Zhan (27 papers)
  9. Zhenglun Kong (33 papers)
  10. Qing Jin (17 papers)
  11. Zhiyu Chen (60 papers)
  12. Sijia Liu (204 papers)
  13. Kaiyuan Yang (32 papers)
  14. Bin Ren (136 papers)
  15. Yanzhi Wang (197 papers)
  16. Xue Lin (92 papers)
Citations (25)