A Unified DNN Weight Compression Framework Using Reweighted Optimization Methods (2004.05531v1)
Abstract: To address the large model size and intensive computation requirements of deep neural networks (DNNs), weight pruning techniques have been proposed and generally fall into two categories: static regularization-based pruning and dynamic regularization-based pruning. However, the former currently suffers from either complex workloads or accuracy degradation, while the latter takes a long time to tune its parameters to achieve the desired pruning rate without accuracy loss. In this paper, we propose a unified DNN weight pruning framework with dynamically updated regularization terms bounded by a designated constraint, which can generate both non-structured sparsity and different kinds of structured sparsity. We also extend our method to an integrated framework that combines different DNN compression tasks.
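The "dynamically updated regularization" idea can be illustrated with classic reweighted L1 minimization, where the penalty weight on each parameter is refreshed from its current magnitude so small weights are pushed toward exact zero while large weights are penalized less. The sketch below is our own toy illustration on a least-squares problem, not the paper's actual framework; all names (`X`, `y`, `lam`, `eps`, the step size) are hypothetical choices.

```python
import numpy as np

# Toy reweighted-L1 pruning sketch (hypothetical, not the paper's code):
#   minimize 0.5 * ||X @ theta - y||^2 + lam * sum_i w_i * |theta_i|
# with w_i = 1 / (|theta_i| + eps) refreshed each outer iteration,
# solved by proximal gradient (soft-thresholding) in the inner loop.

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
theta_true = np.zeros(20)
theta_true[:3] = [3.0, -2.0, 1.5]          # only 3 of 20 "weights" matter
y = X @ theta_true + 0.01 * rng.standard_normal(100)

lam, eps = 0.1, 1e-3
theta = np.zeros(20)
step = 1.0 / np.linalg.norm(X, 2) ** 2     # 1/L, L = largest eigenvalue of X^T X

for outer in range(10):                     # reweighting (outer) loop
    w = 1.0 / (np.abs(theta) + eps)        # small weights -> large penalty
    for _ in range(200):                    # proximal gradient (inner) loop
        grad = X.T @ (X @ theta - y)
        z = theta - step * grad
        thresh = step * lam * w
        theta = np.sign(z) * np.maximum(np.abs(z) - thresh, 0.0)

print("nonzeros:", np.count_nonzero(theta))
```

In a pruning setting, the same reweighting would be applied per weight (for non-structured sparsity) or per filter/channel group (for structured sparsity), with the training loss in place of the least-squares term.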
- Tianyun Zhang (26 papers)
- Xiaolong Ma (57 papers)
- Zheng Zhan (27 papers)
- Shanglin Zhou (14 papers)
- Minghai Qin (28 papers)
- Fei Sun (151 papers)
- Yen-Kuang Chen (10 papers)
- Caiwen Ding (98 papers)
- Makan Fardad (19 papers)
- Yanzhi Wang (197 papers)