Dynamic Network Surgery for Efficient DNNs
"Dynamic Network Surgery for Efficient DNNs" by Yiwen Guo, Anbang Yao, and Yurong Chen presents a novel approach for compressing deep neural networks (DNNs) with the objective of maintaining accuracy while significantly reducing the number of parameters. This technique, termed dynamic network surgery, advances the state-of-the-art in network pruning by incorporating both pruning and splicing operations in a dynamic, iterative manner. This method aims to overcome the limitations of previous approaches like the one by Han et al. (2015), which primarily focused on magnitude-based, greedy pruning.
Summary of Key Concepts
The dynamic network surgery method includes two core operations:
- Pruning: Removing connections deemed unimportant based on their weight magnitude.
- Splicing: Re-establishing pruned connections if they are later found to be important, thus correcting potential pruning errors.
These operations are interleaved throughout training, driven by continual re-assessment of connection importance, so connections that were pruned by mistake can later be restored. This dynamic behavior preserves the network's performance while still achieving significant compression; the sketch below illustrates one iteration.
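To make the mechanism concrete, here is a minimal NumPy sketch of one surgery iteration. It is not the authors' implementation: the thresholds `a` (prune) and `b` (splice) and the quadratic toy loss are illustrative assumptions. The detail that does come from the paper is that the gradient step is applied to all weights, pruned ones included, which is what lets a pruned connection regain magnitude and be spliced back in.

```python
import numpy as np

def update_mask(W, T, a, b):
    """Surgery step: prune entries whose magnitude falls below `a`
    (mask -> 0), splice entries whose magnitude rises to `b` or above
    (mask -> 1); entries in the band [a, b) keep their current mask."""
    T = np.where(np.abs(W) < a, 0.0, T)
    T = np.where(np.abs(W) >= b, 1.0, T)
    return T

def dns_step(W, T, grad_masked, lr, a, b):
    """One dynamic-network-surgery iteration for a single weight matrix.
    `grad_masked` is dLoss/d(W * T); the update touches ALL entries of W,
    pruned ones included, which is what makes later splicing possible."""
    W = W - lr * grad_masked
    T = update_mask(W, T, a, b)
    return W, T

# Toy usage: drive the masked weights W * T toward a random target.
rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, size=(4, 4))
T = np.ones_like(W)
target = rng.normal(0.0, 0.1, size=(4, 4))
for _ in range(200):
    grad = W * T - target  # dLoss/d(W*T) for the loss 0.5 * ||W*T - target||^2
    W, T = dns_step(W, T, grad, lr=0.3, a=0.05, b=0.07)
print("fraction of connections kept:", T.mean())
```

Keeping `b` strictly above `a` creates a dead band that stops borderline weights from being pruned and spliced back on every single iteration.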
Experimental Results
The authors validate their approach on three benchmark networks: LeNet-5 and LeNet-300-100 on the MNIST dataset, and AlexNet on the ImageNet dataset. Comparative results show clear gains over existing methods in both compression rate and training cost; a back-of-the-envelope check on the reported factors follows the list.
- LeNet-5: The number of parameters was reduced by a factor of 108 while the prediction error rate was kept at 0.91%.
- LeNet-300-100: Achieved a 56-fold reduction in the number of parameters while improving the error rate from 2.28% to 1.99%.
- AlexNet: Compressed by a factor of 17.7, with the top-1 error rate improving slightly from 43.42% to 43.09% rather than degrading.
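As a quick sanity check on scale, a k-fold compression keeps roughly 1/k of the weights. The parameter totals below are the commonly cited counts for these architectures, stated here as assumptions for illustration rather than figures quoted from the paper:

```python
# Approximate parameter counts (assumed, commonly cited totals) and the
# compression factors reported above.
models = {
    "LeNet-5": (431_000, 108.0),
    "LeNet-300-100": (267_000, 56.0),
    "AlexNet": (61_000_000, 17.7),
}
for name, (params, rate) in models.items():
    kept = params / rate
    print(f"{name}: ~{kept:,.0f} of {params:,} weights kept "
          f"({100.0 / rate:.2f}% remaining)")
```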
In particular, the comparison with Han et al.'s method highlighted that dynamic network surgery not only offered higher compression rates but also required significantly fewer training iterations to achieve these results.
Theoretical and Practical Implications
The dynamic network surgery approach introduces several implications for both theory and practice in DNN optimization:
Theoretical Implications
- Parameter Importance Adaptation: Continually re-assessing parameter importance reduces the risk of irretrievable network damage that static pruning methods suffer from: a connection that looks unimportant at one stage of training may become important later, and splicing allows the network to recover it. This promotes more robust and flexible model compression.
- Dynamic Maintenance: The cyclical procedure of pruning and splicing can be viewed as an ongoing optimization process, loosely analogous to synaptic pruning and regrowth in biological neural systems, offering new perspectives on dynamic adaptation in artificial neural networks.
Practical Implications
- Deployment Efficiency: The significant reduction in the number of parameters directly translates to lower storage requirements and faster inference, which is particularly beneficial for deploying DNNs on resource-constrained devices such as mobile phones (a rough storage sketch follows this list).
- Training Efficiency: The reduced need for retraining iterations enhances the overall efficiency of model training and updating, easing the computational burden associated with maintaining up-to-date models.
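To put a rough number on the storage point, the sketch below prunes a dense matrix to 90% sparsity (an arbitrary level chosen for the example) and stores the survivors in SciPy's CSR format; a real deployment would likely also quantize values and indices:

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
W = rng.normal(size=(1024, 1024)).astype(np.float32)

# Simulate magnitude pruning at 90% sparsity, then store survivors in CSR.
threshold = np.quantile(np.abs(W), 0.90)
W[np.abs(W) < threshold] = 0.0
W_csr = sparse.csr_matrix(W)

dense_bytes = W.size * W.itemsize
sparse_bytes = W_csr.data.nbytes + W_csr.indices.nbytes + W_csr.indptr.nbytes
print(f"dense:  {dense_bytes / 1e6:.2f} MB")
print(f"sparse: {sparse_bytes / 1e6:.2f} MB "
      f"({dense_bytes / sparse_bytes:.1f}x smaller)")
```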
Future Directions
Future research based on dynamic network surgery could explore several promising directions:
- Generalization to Other Architectures: Extending the technique to more complex models, such as Transformer architectures in NLP tasks, to investigate its versatility and robustness across various domains.
- Automated Threshold Determination: Developing more principled methods for automatically setting the pruning and splicing thresholds could further improve the method's adaptability and performance (a simple statistics-based baseline is sketched after this list).
- Hardware Optimization: Investigating the implications of dynamic network surgery on specialized hardware, for instance, TPUs or FPGAs, to maximize inference efficiency and battery life in embedded systems.
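On the threshold point above, one simple baseline is to derive per-layer thresholds from weight statistics instead of hand-tuning them. The recipe below (pruning threshold from the mean and standard deviation of a layer's absolute weights, splicing threshold a small margin above it) is a plausible assumption in the spirit of the paper's per-layer settings, not its exact procedure:

```python
import numpy as np

def layer_thresholds(W, sensitivity=0.75, margin=0.1):
    """Hypothetical recipe: scale the pruning threshold `a` with the
    layer's weight statistics, and place the splicing threshold `b`
    slightly above it so borderline weights are not toggled constantly."""
    abs_w = np.abs(W)
    a = abs_w.mean() + sensitivity * abs_w.std()  # prune where |w| < a
    b = a * (1.0 + margin)                        # splice where |w| >= b
    return a, b

rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.02, size=(300, 100))
a, b = layer_thresholds(W)
print(f"a={a:.4f}, b={b:.4f}, would prune "
      f"{100.0 * np.mean(np.abs(W) < a):.1f}% of this layer")
```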
In conclusion, dynamic network surgery represents a significant step forward in the efficient compression of deep neural networks. The method addresses the inefficiencies of earlier greedy, irreversible pruning techniques and introduces a flexible, dynamic approach that preserves accuracy while substantially reducing storage and computation. This research paves the way for more scalable and deployable AI systems, particularly where computational resources are limited.