PCNN: Pattern-based Fine-Grained Regular Pruning towards Optimizing CNN Accelerators (2002.04997v2)
Published 11 Feb 2020 in cs.LG and stat.ML
Abstract: Weight pruning is a powerful technique for model compression. We propose PCNN, a fine-grained regular 1D pruning method, together with a novel index format called the Sparsity Pattern Mask (SPM) to encode the sparsity in PCNN. Because SPM uses a limited set of pruning patterns and non-zero sequences of equal length, PCNN can be efficiently employed in hardware. Evaluated on VGG-16 and ResNet-18, PCNN achieves a compression rate of up to 8.4X with only 0.2% accuracy loss. We also implement a pattern-aware architecture in a 55nm process, achieving up to 9.0X speedup and 28.39 TOPS/W efficiency with only 3.1% on-chip memory overhead for indices.
- Zhanhong Tan
- Jiebo Song
- Xiaolong Ma
- Sia-Huat Tan
- Hongyang Chen
- Yuanqing Miao
- Yifu Wu
- Shaokai Ye
- Yanzhi Wang
- Dehui Li
- Kaisheng Ma
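The core idea in the abstract — restricting each 1D weight group to one of a small dictionary of sparsity patterns, all with the same number of non-zeros — can be sketched as below. The pattern set, group length, and the magnitude-based selection criterion here are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

# Illustrative pattern dictionary: 4 allowed masks over groups of k=4
# weights, each keeping exactly m=2 non-zeros. The chosen pattern's index
# plays the role of the Sparsity Pattern Mask (SPM) entry for that group.
PATTERNS = [
    np.array([1, 1, 0, 0], dtype=bool),
    np.array([1, 0, 1, 0], dtype=bool),
    np.array([0, 1, 0, 1], dtype=bool),
    np.array([0, 0, 1, 1], dtype=bool),
]

def prune_group(w):
    """Prune one length-4 weight group: pick the pattern preserving the
    most L1 magnitude, and return (SPM index, pruned weights, packed
    non-zero sequence)."""
    scores = [np.abs(w[p]).sum() for p in PATTERNS]
    idx = int(np.argmax(scores))
    mask = PATTERNS[idx]
    return idx, w * mask, w[mask]

w = np.array([0.9, -0.1, 0.8, 0.05])
idx, pruned, nz = prune_group(w)
# Every group keeps the same number of non-zeros, so the packed non-zero
# sequences have equal length and form a regular memory layout; only the
# small per-group pattern index must be stored as sparsity metadata.
```

Because the pattern dictionary is small, each group's index needs only a few bits, which is consistent with the low on-chip index overhead the abstract reports.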