
Designing Energy-Efficient Convolutional Neural Networks using Energy-Aware Pruning (1611.05128v4)

Published 16 Nov 2016 in cs.CV

Abstract: Deep convolutional neural networks (CNNs) are indispensable to state-of-the-art computer vision algorithms. However, they are still rarely deployed on battery-powered mobile devices, such as smartphones and wearable gadgets, where vision algorithms can enable many revolutionary real-world applications. The key limiting factor is the high energy consumption of CNN processing due to its high computational complexity. While there are many previous efforts that try to reduce the CNN model size or amount of computation, we find that they do not necessarily result in lower energy consumption, and therefore do not serve as a good metric for energy cost estimation. To close the gap between CNN design and energy consumption optimization, we propose an energy-aware pruning algorithm for CNNs that directly uses energy consumption estimation of a CNN to guide the pruning process. The energy estimation methodology uses parameters extrapolated from actual hardware measurements that target realistic battery-powered system setups. The proposed layer-by-layer pruning algorithm also prunes more aggressively than previously proposed pruning methods by minimizing the error in output feature maps instead of filter weights. For each layer, the weights are first pruned and then locally fine-tuned with a closed-form least-square solution to quickly restore the accuracy. After all layers are pruned, the entire network is further globally fine-tuned using back-propagation. With the proposed pruning method, the energy consumption of AlexNet and GoogLeNet are reduced by 3.7x and 1.6x, respectively, with less than 1% top-5 accuracy loss. Finally, we show that pruning the AlexNet with a reduced number of target classes can greatly decrease the number of weights but the energy reduction is limited. Energy modeling tool and energy-aware pruned models available at http://eyeriss.mit.edu/energy.html

Citations (716)

Summary

  • The paper introduces an energy-aware pruning algorithm that directly minimizes energy use by targeting high-energy layers in CNNs.
  • It develops a novel energy estimation methodology accounting for computation, memory access, and data reuse to guide the pruning process.
  • Results on AlexNet and GoogLeNet show energy reductions of up to 3.7× with minimal accuracy loss, enabling efficient deployment on battery-powered devices.

Designing Energy-Efficient Convolutional Neural Networks using Energy-Aware Pruning

The paper "Designing Energy-Efficient Convolutional Neural Networks using Energy-Aware Pruning" by Tien-Ju Yang, Yu-Hsin Chen, and Vivienne Sze presents a novel approach to minimizing the energy consumption of convolutional neural networks (CNNs). This is particularly relevant for deploying CNNs on battery-powered mobile devices such as smartphones and wearable gadgets.

Summary of Contributions

The authors identify that conventional methods to reduce CNN model size or computation do not always lead to lower energy consumption. Addressing this issue, the paper proposes an energy-aware pruning algorithm that directly uses energy consumption as a metric to guide the pruning process. The core contributions of the work include:

  1. Energy Estimation Methodology:
    • A new framework is designed to estimate the energy consumption of CNNs, considering both computation and memory accesses.
    • Parameters are extrapolated from actual hardware measurements.
    • The energy estimation tool and methodology take into account data sparsity and bitwidth reduction.
  2. Energy-Aware Pruning Algorithm:
    • A novel layer-by-layer pruning method is employed that minimizes changes in output feature maps rather than filter weights.
    • This method aggressively prunes layers starting from those that consume the most energy.
    • Each layer's weights are first pruned and then locally fine-tuned using a closed-form least-square solution.
    • After layer-specific pruning, the entire network is globally fine-tuned using back-propagation.
  3. Comprehensive Evaluation:
    • The proposed method significantly reduces the energy consumption of AlexNet and GoogLeNet while maintaining high accuracy.
    • Comparative analysis illustrates that the energy-aware pruned models are more efficient than those pruned with existing methods.
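The prune-then-restore step in item 2 can be sketched for a single linear layer. After zeroing a set of weights, the surviving weights are re-fit with a closed-form least-squares solve so that the layer's output feature maps best match the originals on a set of calibration inputs. This is an illustrative sketch, not the authors' implementation: the magnitude-based mask, the per-output-unit solve, and all names here are assumptions made for the example.

```python
import numpy as np

def prune_and_restore(W, X, keep_ratio=0.5):
    """Prune a linear layer y = W @ x, then locally fine-tune.

    W: (out, in) weight matrix; X: (in, n_samples) calibration inputs.
    Keeps the largest-magnitude weights per output unit, then re-fits
    the survivors with a closed-form least-squares solve so the layer
    output best matches the original output feature maps.
    """
    Y = W @ X                            # original output feature maps
    k = max(1, int(keep_ratio * W.shape[1]))
    mask = np.zeros_like(W, dtype=bool)
    for i in range(W.shape[0]):
        top = np.argsort(np.abs(W[i]))[-k:]   # indices of kept weights
        mask[i, top] = True
    W_new = np.zeros_like(W)
    for i in range(W.shape[0]):
        idx = np.flatnonzero(mask[i])
        A = X[idx].T                     # (n_samples, k) surviving inputs
        w, *_ = np.linalg.lstsq(A, Y[i], rcond=None)
        W_new[i, idx] = w                # closed-form restored weights
    return W_new, mask
```

Because the least-squares solve minimizes the output error over all settings of the surviving weights, the restored layer can never be worse (on the calibration data) than simply masking the weights, which is what makes the aggressive per-layer pruning recoverable before the final global back-propagation pass.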

Key Results

The effectiveness of the proposed pruning method is demonstrated on CNN architectures like AlexNet, GoogLeNet, and SqueezeNet. Notable results include:

  • AlexNet: Energy consumption reduced by 3.7× with less than 1% top-5 accuracy loss.
  • GoogLeNet: Energy consumption reduced by 1.6× with less than 1% top-5 accuracy loss.
  • SqueezeNet: The proposed method yields fewer weights, fewer MAC operations, and lower energy consumption than the magnitude-based pruning of Han et al.'s Deep Compression (ICLR 2016).

Insights and Implications

The paper provides several crucial insights:

  • Energy Consumption Components: Convolutional layers dominate the energy consumption of CNNs, largely because of the intensive data movement required for their feature maps.
  • Model Size vs. Energy: There is no direct correlation between the number of weights or operations in a CNN and its actual energy consumption.
  • Impact of Data Reuse: Energy consumption is significantly influenced by the degree of data reuse across the memory hierarchy.
  • Feature Map Considerations: Efficient design for energy consumption must consider feature map movement, not just weight or operation reduction.
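The "model size vs. energy" and "data reuse" insights can be made concrete with a toy energy model in the spirit of the paper's methodology: total energy is computation energy plus data-movement energy, where an access costs more the farther it travels in the memory hierarchy (a DRAM fetch costs orders of magnitude more than an on-chip buffer access). The cost constants and access counts below are illustrative assumptions, not the paper's measured hardware numbers.

```python
# Toy energy model: computation energy + data-movement energy,
# in MAC-equivalent units. Constants are illustrative only.
COST_MAC = 1.0
COST_BUFFER = 6.0      # on-chip buffer access
COST_DRAM = 200.0      # off-chip DRAM access

def layer_energy(n_macs, buffer_accesses, dram_accesses):
    """Estimate one layer's energy in MAC-equivalent units."""
    return (n_macs * COST_MAC
            + buffer_accesses * COST_BUFFER
            + dram_accesses * COST_DRAM)

# Two layers with the SAME number of MACs (and similar weight counts)
# can differ widely in energy: high data reuse keeps most fetches in
# the on-chip buffer, while low reuse forces repeated DRAM traffic.
high_reuse = layer_energy(n_macs=1_000_000,
                          buffer_accesses=3_000_000,
                          dram_accesses=60_000)
low_reuse = layer_energy(n_macs=1_000_000,
                         buffer_accesses=3_000_000,
                         dram_accesses=1_000_000)
```

Under these toy constants the low-reuse layer costs roughly 7× more energy than the high-reuse one despite identical computation, which is why counting weights or operations alone is a poor proxy for energy and why the pruning order targets the highest-energy layers first.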

The methodology and results shown in this paper have notable implications:

  1. Practical Deployment:
    • The insights and techniques can be applied to deploy more energy-efficient CNNs on mobile and wearable devices.
    • Real-world applications can benefit from prolonged battery life without substantial loss of CNN accuracy.
  2. Future Research Directions:
    • Further research can explore combining energy-aware pruning with other model optimization techniques like bitwidth reduction or weight sharing.
    • Investigation into diverse CNN architectures and their energy profiles could lead to broader applications of the proposed techniques.

Conclusion

This paper presents a robust method for improving the energy efficiency of CNNs through an innovative pruning approach that directly targets energy consumption metrics. The authors successfully demonstrate the substantial energy savings that can be achieved while maintaining high accuracy, making significant strides towards the practical deployment of CNNs on energy-constrained devices. The work opens up new avenues for research in energy-efficient AI model design and optimization.
