Neuron Importance Score Propagation (NISP) for Neural Network Pruning: A Detailed Insight
The paper "NISP: Pruning Networks using Neuron Importance Score Propagation" addresses the inherent redundancy in deep Convolutional Neural Networks (CNNs) and presents an effective pruning strategy that focuses on neuron importance propagation across the entire network for computational efficiency and memory reduction. Unlike prior methodologies that consider neuron importance based on local statistics within a layer or a pair of layers, this approach provides a comprehensive framework to prune nodes globally by propagating the importance scores from the final response layer throughout the network.
Key Contributions
The main contributions of this paper are twofold:
- Theoretical Formulation: The neuron pruning problem is formulated as a binary integer optimization objective aimed at minimizing the reconstruction error in important responses at the final response layer (FRL). This provides a clear, mathematical underpinning for the network pruning process.
- Neuron Importance Score Propagation Algorithm: The paper introduces the NISP algorithm that propagates the importance scores from the high-level final responses to every neuron in earlier layers. This propagation leverages the weight matrices of the network, ensuring that pruning retains the predictive power of the CNN.
Detailed Methodology
Feature Ranking on Final Response Layer
The authors argue that for a CNN to maintain its predictive power after pruning, it is crucial to retain the most significant neurons, especially in the FRL, whose responses directly influence the classification output. The Inf-FS algorithm, which assesses neurons by their discriminative power, is applied for feature ranking on the FRL because of its proven effectiveness in feature selection tasks.
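To make the ranking step concrete, below is a minimal NumPy sketch of an Inf-FS-style score over FRL activations. The affinity definition (mixing per-neuron dispersion with pairwise correlation), the convergence scaling, and the function name `inf_fs_rank` are illustrative assumptions; the paper applies the original Inf-FS algorithm, whose exact affinity and parameters may differ.

```python
import numpy as np

def inf_fs_rank(F, alpha=0.5, r=0.9):
    """Sketch of an Inf-FS-style ranking of FRL neurons.

    F: (num_samples, num_neurons) matrix of final-response-layer activations.
    Returns one importance score per neuron (higher = more important).
    """
    std = F.std(axis=0)                                # per-neuron dispersion
    sigma = np.maximum.outer(std, std)                 # pairwise max of std-devs
    corr = np.abs(np.corrcoef(F, rowvar=False))        # pairwise |correlation|
    A = alpha * sigma + (1.0 - alpha) * (1.0 - corr)   # affinity graph over neurons

    # Scale A so the geometric series over path lengths converges.
    rho = np.max(np.abs(np.linalg.eigvals(A)))
    A = (r / rho) * A

    # S = sum_{l >= 1} A^l = (I - A)^{-1} - I : energy of paths of all lengths.
    n = A.shape[0]
    S = np.linalg.inv(np.eye(n) - A) - np.eye(n)
    return S.sum(axis=1)                               # row sums = neuron scores
```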
Formulation and Solution
The problem is defined formally as minimizing the weighted ℓ1 distance between the original FRL responses and those produced by the pruned network. The authors derive an upper bound on this objective and provide a relaxed closed-form solution, which leads to a recursive propagation of neuron importance scores and enables efficient, layer-wise pruning of the entire network.
- Binary Integer Problem: Pruning layer $l$ is formulated as
  $$\operatorname*{arg\,min}_{s_l^*} \sum_{m=1}^{M} F\!\left(s_l^* \,\middle|\, x_l^{(m)}, s_n;\, \mathcal{G}^{(l+1,n)}\right),$$
  where $s_l^*$ is the binary retention indicator for the neurons of layer $l$, $x_l^{(m)}$ is the response of layer $l$ to the $m$-th sample, $s_n$ holds the FRL importance scores, and $\mathcal{G}^{(l+1,n)}$ represents the sub-network of layers between $l+1$ and $n$. This optimization problem seeks to identify the least important neurons, which can then be pruned.
- Closed-Form Solution: The importance scores $s_k$ for layer $k$ can be derived recursively as
  $$s_k = \left|W^{(k+1)}\right|^{\top} s_{k+1},$$
  where $W^{(k+1)}$ is the weight matrix connecting layer $k$ to layer $k+1$. Because the propagation depends on the absolute values of the weight matrices, importance is distributed according to each neuron's connectivity and impact on subsequent layers; a short sketch of this rule follows the list.
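Below is a minimal NumPy sketch of this backward propagation for a stack of fully connected layers. The function name, the `keep_ratios` argument, and the top-k selection per layer are illustrative choices rather than the paper's exact interface; in NISP the per-layer pruning ratios are pre-defined hyper-parameters, and the pruned network is fine-tuned afterwards.

```python
import numpy as np

def propagate_importance(weights, s_final, keep_ratios):
    """Propagate importance scores backwards from the final response layer.

    weights:     [W_1, ..., W_n], where W_k has shape (dim_k, dim_{k-1}) and
                 maps layer k-1 responses to layer k.
    s_final:     importance scores of the final response layer, shape (dim_n,).
    keep_ratios: fraction of neurons to keep in each scored layer, ordered from
                 the layer just below the FRL down to the first layer.

    Returns per-layer scores and binary keep-masks. Only fully connected layers
    are handled here; the paper applies the same rule to convolutional layers
    by viewing them as linear operations.
    """
    scores, masks = [], []
    s = np.asarray(s_final, dtype=float)
    # Walk from the last layer toward the input: s_k = |W^(k+1)|^T s_{k+1}.
    for W, ratio in zip(reversed(weights), keep_ratios):
        s = np.abs(W).T @ s                      # closed-form propagation rule
        k = max(1, int(round(ratio * s.size)))   # number of neurons to keep
        mask = np.zeros(s.size, dtype=bool)
        mask[np.argsort(s)[-k:]] = True          # keep the k highest-scored neurons
        scores.append(s)
        masks.append(mask)
    return scores[::-1], masks[::-1]             # reorder from first layer upwards
```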
Evaluation and Results
Experimental Setup: NISP's efficacy is validated on the MNIST, CIFAR10, and ImageNet datasets using several CNN architectures, including LeNet, Cifar-net, AlexNet, GoogLeNet, and ResNet. Each model is pruned from a pre-trained state and then fine-tuned, which keeps the accuracy loss small.
Key Metrics: The experiments measure performance in terms of accuracy loss, reduction in floating-point operations (FLOPs), and parameter reduction. Comparisons are drawn with baseline methods including random pruning, magnitude-based pruning, and layer-by-layer pruning.
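To illustrate the FLOPs and parameter metrics, the toy calculation below shows how pruning channels on both sides of a 3x3 convolution reduces its cost quadratically. The counting convention (one operation per multiply-accumulate) and the layer dimensions are assumptions for illustration, not figures taken from the paper's experiments.

```python
def conv_flops(c_in, c_out, k, h_out, w_out):
    """Multiply-accumulates of a k x k convolution producing an h_out x w_out map."""
    return c_in * c_out * k * k * h_out * w_out

def conv_params(c_in, c_out, k):
    """Weight count of the same convolution (biases ignored)."""
    return c_in * c_out * k * k

# Hypothetical layer: prune half of the input and half of the output channels.
orig, pruned = conv_flops(64, 128, 3, 56, 56), conv_flops(32, 64, 3, 56, 56)
print("FLOP reduction:  {:.1%}".format(1 - pruned / orig))   # 75.0%
print("Param reduction: {:.1%}".format(
    1 - conv_params(32, 64, 3) / conv_params(64, 128, 3)))   # 75.0%
```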
Performance: The NISP algorithm consistently outperforms these baselines, demonstrating:
- Faster convergence and smaller initial accuracy degradation post-pruning.
- Significant reduction in computational overhead (FLOPs) without substantial loss in predictive accuracy (e.g., only 1.43% accuracy loss on AlexNet with a 67.85% reduction in FLOPs).
Theoretical and Practical Implications
The proposed NISP algorithm addresses critical challenges in deep network pruning by providing a scalable and theoretically substantiated approach that can be easily applied across different neural network architectures. The framework:
- Ensures consideration of neuron importance influenced by global network behavior rather than limited local statistics.
- Is versatile, allowing feature ranking in any layer of interest and propagation across various types of layers, including fully connected, convolutional, and pooling layers (see the channel-level sketch below).
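As an illustration of how the same rule can cross convolutional and pooling layers, here is a hedged channel-level sketch: the spatial kernel weights linking each (input, output) channel pair are aggregated by their absolute sum, and pooling, having no weights, passes channel importance through. Both the aggregation and the pass-through are simplifying assumptions; the paper's exact treatment of each layer type may differ.

```python
import numpy as np

def propagate_through_conv(kernel, s_out):
    """Channel-level importance propagation through a convolutional layer.

    kernel: (c_out, c_in, k_h, k_w) convolution weights.
    s_out:  importance of the output channels, shape (c_out,).
    Returns importance of the input channels, shape (c_in,).
    """
    conn = np.abs(kernel).sum(axis=(2, 3))       # (c_out, c_in) channel connectivity
    return conn.T @ np.asarray(s_out, float)     # |W|^T s at channel granularity

def propagate_through_pool(s_out):
    """Pooling has no weights; per-channel importance is passed through (a simplification)."""
    return s_out
```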
Future Perspectives
Given the promising results, future research could extend NISP in several ways:
- Adaptation for specialized hardware accelerators such as FPGAs or ASICs, where memory and computational efficiency are paramount.
- Application to other domains including Recurrent Neural Networks (RNNs) and Transformer models to explore its generalizability and efficacy in diverse deep learning contexts.
Conclusion
The NISP algorithm as presented provides an elegant and effective means to prune deep neural networks by leveraging neuron importance propagation. This approach mitigates reconstruction errors and maintains model performance, significantly optimizing computational resources. It stands as a robust framework well-primed for integration into various CNN architectures, effectively scaling deep learning models to be more resource-efficient while retaining their predictive capabilities.