Filter Sketch for Network Pruning
The paper "Filter Sketch for Network Pruning" introduces a novel approach to reducing the complexity of convolutional neural networks (CNNs) through a pruning method termed FilterSketch. Its central premise is to preserve information during pruning by casting the problem as matrix sketching. This contrasts with traditional pruning techniques, which simply discard filters deemed less important and thereby lose information about the network's structure.
Methodology Overview
FilterSketch operates by encoding the second-order information of the pre-trained weights within a CNN. This is achieved with the Frequent Directions method, which efficiently approximates the covariance of each layer's filter matrix without iterative, data-driven optimization. Because this covariance information is preserved, the representation capacity of the pruned network can be recovered with minimal accuracy loss after a straightforward fine-tuning procedure.
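The streaming procedure the method builds on can be sketched as follows. This is an illustrative NumPy implementation of the generic Frequent Directions algorithm (Liberty, 2013), not the paper's released code; it compresses the rows of a matrix into a small sketch whose Gram matrix approximates the original covariance.

```python
import numpy as np

def frequent_directions(A, ell):
    """Basic Frequent Directions sketch.

    Streams the rows of A (n x d) into a sketch B (ell x d) such that
    A^T A - B^T B is positive semi-definite and its spectral norm is at
    most ||A||_F^2 / ell. In the pruning setting, the rows of A would be
    a layer's flattened filters.
    """
    n, d = A.shape
    B = np.zeros((ell, d))
    zero_rows = list(range(ell))          # indices of all-zero sketch rows
    for a in A:
        if not zero_rows:                 # sketch is full: shrink via SVD
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            # Subtract the smallest squared singular value, zeroing at
            # least one direction and freeing at least one sketch row.
            s = np.sqrt(np.maximum(s ** 2 - s[-1] ** 2, 0.0))
            B = np.zeros((ell, d))
            B[:len(s)] = s[:, None] * Vt
            zero_rows = [i for i in range(ell)
                         if i >= len(s) or s[i] == 0.0]
        B[zero_rows.pop()] = a            # insert the incoming row
    return B
```

A single pass with one SVD per shrink step is what makes this far cheaper than data-driven, layer-wise optimization of which filters to keep.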
This approach significantly curtails the computational cost of conventional pruning pipelines, which often require training from scratch or layer-wise optimization. The paper reports substantial reductions in both FLOPs and parameter counts across several architectures, such as ResNet-50 and ResNet-110, on datasets including CIFAR-10 and ILSVRC-2012.
Numerical Results and Implications
The proposed FilterSketch method delivers strong numerical results, pruning networks aggressively while maintaining accuracy. On CIFAR-10, it removes 63.3% of FLOPs and 59.9% of parameters from ResNet-110 with merely a 0.06% accuracy drop. On ILSVRC-2012, it removes 45.5% of FLOPs and 43% of parameters from ResNet-50 with only a 0.69% accuracy drop. These empirical findings underline both the practical feasibility and the efficiency of the approach, offering a fast path to deep CNN compression.
Theoretical and Practical Impact
Theoretically, FilterSketch shifts the perspective on network pruning by emphasizing the preservation of second-order statistical information rather than relying solely on first-order criteria such as weight magnitude. Preserving this information eases fine-tuning of the pruned model and helps it recover performance after compression. As the first attempt to incorporate weight-information preservation into pruning, the approach sets a precedent for future research on structured pruning methods.
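To make the first-order versus second-order contrast concrete, here is a minimal NumPy sketch. `l1_filter_ranking` implements a standard magnitude (L1-norm) criterion of the kind the paper moves away from, while `filter_covariance` computes the second-order statistic (the Gram/covariance matrix of the flattened filters) that FilterSketch aims to preserve. The function names and shapes here are illustrative assumptions, not the paper's API.

```python
import numpy as np

def l1_filter_ranking(W):
    """First-order criterion: rank conv filters by L1 norm.

    W: one conv layer's weights, shape (out_channels, in_channels, k, k).
    Returns filter indices ordered from least to most important; a
    magnitude-based pruner would drop the first entries.
    """
    scores = np.abs(W.reshape(W.shape[0], -1)).sum(axis=1)
    return np.argsort(scores)

def filter_covariance(W):
    """Second-order information: the covariance (Gram) matrix of the
    flattened filter matrix, which sketch-based pruning seeks to
    approximate with fewer filters."""
    F = W.reshape(W.shape[0], -1)     # (out_channels, in_channels * k * k)
    return F @ F.T                    # (out_channels, out_channels)
```

The L1 ranking looks at each filter in isolation, whereas the covariance matrix also captures how filters relate to one another, which is the information a sketch can retain.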
Practically, the drastically reduced computational load introduces new prospects for deploying complex CNN models on devices with constrained computational capabilities, such as mobile phones and embedded systems. This could foster innovations in areas where computational resources are limited but high accuracy and efficiency are pivotal.
Future Directions
The exploration of information-preserving network pruning opens several avenues for future development. Applying similar methodology to other architectures, such as transformers or recurrent neural networks, might yield comparable gains in computational efficiency. Additionally, extending the approach to dynamic pruning during training could further decrease computational overhead and improve adaptability as the feature space evolves.
In conclusion, FilterSketch provides a robust framework for efficiently pruning neural networks while maintaining significant aspects of their representational integrity. This approach redefines the boundaries of network simplification and lays the groundwork for further exploration into efficient deep learning model deployment without sacrificing accuracy.