
ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression (1707.06342v1)

Published 20 Jul 2017 in cs.CV

Abstract: We propose an efficient and unified framework, namely ThiNet, to simultaneously accelerate and compress CNN models in both training and inference stages. We focus on the filter level pruning, i.e., the whole filter would be discarded if it is less important. Our method does not change the original network structure, thus it can be perfectly supported by any off-the-shelf deep learning libraries. We formally establish filter pruning as an optimization problem, and reveal that we need to prune filters based on statistics information computed from its next layer, not the current layer, which differentiates ThiNet from existing methods. Experimental results demonstrate the effectiveness of this strategy, which has advanced the state-of-the-art. We also show the performance of ThiNet on ILSVRC-12 benchmark. ThiNet achieves 3.31$\times$ FLOPs reduction and 16.63$\times$ compression on VGG-16, with only 0.52$\%$ top-5 accuracy drop. Similar experiments with ResNet-50 reveal that even for a compact network, ThiNet can also reduce more than half of the parameters and FLOPs, at the cost of roughly 1$\%$ top-5 accuracy drop. Moreover, the original VGG-16 model can be further pruned into a very small model with only 5.05MB model size, preserving AlexNet level accuracy but showing much stronger generalization ability.

ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression

The paper "ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression" by Jian-Hao Luo, Jianxin Wu, and Weiyao Lin introduces a novel approach to effectively reduce the computational complexity and storage requirements of Convolutional Neural Networks (CNNs). The key concept revolves around filter-level pruning, a technique where less essential filters are systematically discarded to streamline the network. This method not only preserves the structure of the original network but also ensures compatibility with existing deep learning libraries without the need for specialized software or hardware.

Theoretical Framework and Optimization Problem

ThiNet distinguishes itself by establishing filter pruning as an optimization problem. Specifically, the method determines the importance of filters based on statistics computed from the next layer, as opposed to current-layer techniques prevalent in existing methodologies. This novel insight facilitates more accurate pruning decisions, underpinned by a clear mathematical formulation. The paper's objective function aims to minimize the reconstruction error of the subsequent layer’s outputs when specific filters are removed, leading to a well-defined optimization problem that can be efficiently solved using a greedy algorithm.
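Concretely, if $\hat{x}_{i,c}$ denotes the contribution of input channel $c$ to the $i$-th sampled activation of the next layer, the paper's greedy procedure grows the set of removed channels $T$ so as to keep $\sum_i \bigl(\sum_{c \in T} \hat{x}_{i,c}\bigr)^2$ as small as possible. The sketch below illustrates that greedy selection step only; the variable names and the assumption that per-channel contributions have already been sampled are illustrative, not the authors' code.

```python
import numpy as np

def thinet_greedy_channel_selection(X, compression_rate):
    """
    Greedy channel selection in the spirit of ThiNet.

    X : array of shape (num_samples, num_channels). Entry X[i, c] is the
        sampled contribution of input channel c to one output activation
        of the *next* layer (per-channel partial sums; an assumption about
        how the data was collected, not part of this summary).
    compression_rate : fraction of channels to remove, e.g. 0.5.

    Returns the indices of channels to prune.
    """
    num_channels = X.shape[1]
    num_to_remove = int(num_channels * compression_rate)

    removed = []                       # T: channels selected for removal
    residual = np.zeros(X.shape[0])    # summed contribution of removed channels

    for _ in range(num_to_remove):
        best_c, best_err = None, np.inf
        for c in range(num_channels):
            if c in removed:
                continue
            # reconstruction error of the next layer's output if channel c
            # were also removed
            err = np.sum((residual + X[:, c]) ** 2)
            if err < best_err:
                best_c, best_err = c, err
        removed.append(best_c)
        residual += X[:, best_c]

    return sorted(removed)
```

After the pruned channels are fixed, the paper additionally learns a scaling of the remaining channels by least squares and fine-tunes the network, steps omitted from this sketch.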

Experimental Evaluation and Results

The efficacy of ThiNet is rigorously evaluated on large-scale datasets such as ILSVRC-12. The method demonstrates substantial reductions in floating point operations (FLOPs) and model parameters with minimal performance degradation. For instance, ThiNet achieves a 3.31× reduction in FLOPs and a 16.63× reduction in parameters for the VGG-16 model with only a 0.52% drop in top-5 accuracy. In the case of ResNet-50, the FLOPs and parameters are reduced by more than half at the cost of roughly a 1% drop in top-5 accuracy. Such results highlight the robustness and effectiveness of ThiNet in compressing even compact networks like ResNet-50.
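To put the ratios in absolute terms, a back-of-envelope calculation is shown below. The baseline figures for VGG-16 (roughly 138M parameters and 15.5G FLOPs per 224×224 image) are widely cited values assumed here, not numbers taken from this summary.

```python
# Rough translation of the reported reduction ratios into absolute numbers,
# assuming commonly cited VGG-16 baselines (~138M parameters, ~15.5G FLOPs).
vgg16_params = 138e6
vgg16_flops = 15.5e9

pruned_params = vgg16_params / 16.63   # roughly 8.3M parameters remain
pruned_flops = vgg16_flops / 3.31      # roughly 4.7G FLOPs per image

print(f"params: {pruned_params / 1e6:.1f}M, FLOPs: {pruned_flops / 1e9:.2f}G")
```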

Comparative Analysis

The paper benchmarks ThiNet against several state-of-the-art pruning methods, including APoZ-based pruning and Taylor expansion-based criteria. ThiNet consistently outperforms these methods, particularly at higher compression rates. For instance, while APoZ and Taylor methods exhibit significant accuracy losses or require specialized setups, ThiNet maintains high accuracy levels due to its next-layer-driven pruning criterion. This comparison underscores ThiNet's superiority in balancing compression and accuracy.
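For contrast, a current-layer criterion such as APoZ ranks filters purely by how often their own post-ReLU activations are zero. The rough sketch below illustrates that baseline criterion; it is not taken from the paper, and the array layout is an assumption.

```python
import numpy as np

def apoz_scores(activations):
    """
    Average Percentage of Zeros (APoZ), a current-layer pruning criterion
    used as a comparison baseline. `activations` is assumed to hold the
    post-ReLU outputs of one layer with shape
    (num_samples, num_filters, height, width).
    Filters with the highest APoZ would be pruned first.
    """
    zeros = (activations == 0).astype(np.float64)
    # fraction of zero responses per filter, averaged over samples and positions
    return zeros.mean(axis=(0, 2, 3))
```

ThiNet instead measures how much the next layer's output changes when a filter's contribution is dropped, which is the property the paper credits for its more graceful degradation at high compression rates.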

Practical and Theoretical Implications

Practically, the ability to prune networks without altering their structure makes ThiNet highly applicable for deploying deep learning models on resource-constrained devices such as mobile phones and embedded systems. The vast reduction in memory footprint and computational demands extends the use of sophisticated CNN models to scenarios where they were previously infeasible. Theoretically, ThiNet's framework provides a new lens for understanding filter importance, paving the way for further research on optimizing CNN architectures.

Future Directions

The research opens several avenues for future exploration. One promising direction is extending the pruning strategy to more complex network components, such as the projection shortcuts in ResNet, which pose additional challenges due to their integral role in residual connections. Moreover, combining ThiNet with other compression techniques like parameter quantization could yield even more compact models. Additionally, the implications of ThiNet on other vision tasks such as object detection and semantic segmentation warrant comprehensive investigation.

In conclusion, ThiNet represents a significant advancement in the domain of deep learning model compression and acceleration. By leveraging a next-layer-driven optimization framework, it achieves substantial reductions in model size and computational load with minimal accuracy loss, thus enhancing the feasibility of deploying deep learning models in real-world, resource-constrained environments.

Authors (3)
  1. Jian-Hao Luo (7 papers)
  2. Jianxin Wu (82 papers)
  3. Weiyao Lin (87 papers)
Citations (1,703)