BranchyNet: Fast Inference via Early Exiting from Deep Neural Networks (1709.01686v1)

Published 6 Sep 2017 in cs.NE, cs.CV, and cs.LG

Abstract: Deep neural networks are state of the art methods for many learning tasks due to their ability to extract increasingly better features at each network layer. However, the improved performance of additional layers in a deep network comes at the cost of added latency and energy usage in feedforward inference. As networks continue to get deeper and larger, these costs become more prohibitive for real-time and energy-sensitive applications. To address this issue, we present BranchyNet, a novel deep network architecture that is augmented with additional side branch classifiers. The architecture allows prediction results for a large portion of test samples to exit the network early via these branches when samples can already be inferred with high confidence. BranchyNet exploits the observation that features learned at an early layer of a network may often be sufficient for the classification of many data points. For more difficult samples, which are expected less frequently, BranchyNet will use further or all network layers to provide the best likelihood of correct prediction. We study the BranchyNet architecture using several well-known networks (LeNet, AlexNet, ResNet) and datasets (MNIST, CIFAR10) and show that it can both improve accuracy and significantly reduce the inference time of the network.

Citations (1,017)

Summary

  • The paper introduces a novel BranchyNet architecture that uses early exit branches in DNNs to significantly reduce inference time and energy consumption.
  • The joint training method optimizes loss functions at multiple exits, effectively regularizing the network and preventing overfitting.
  • Empirical results on models like LeNet, AlexNet, and ResNet show speedups up to 5.4x with minimal accuracy trade-offs.

BranchyNet: Fast Inference via Early Exiting from Deep Neural Networks

Introduction

The paper "BranchyNet: Fast Inference via Early Exiting from Deep Neural Networks" introduces a significant architectural innovation aimed at optimizing the trade-off between network depth and inference efficiency. As deep neural networks (DNNs) achieve remarkable success in various learning tasks, the need for reduced latency and energy consumption in real-time applications becomes critical. The authors propose BranchyNet, an architecture that incorporates early exit branches into standard DNNs, allowing for faster inference by classifying simpler samples at intermediate layers.

BranchyNet Architecture

BranchyNet enhances conventional DNNs by embedding exit branches at strategic points along the network. These branches allow samples that can already be classified with high confidence to exit early, saving computation and avoiding the latency of pushing every sample through all layers of a deep network. Each exit branch consists of one or more convolutional layers followed by a fully-connected classifier, enabling early classification without compromising the accuracy of the full network.
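
The following is a minimal PyTorch-style sketch of this layout, with a single side branch attached after the first backbone stage. The class names, layer sizes, and branch placement are illustrative assumptions, not the paper's exact B-LeNet/B-AlexNet/B-ResNet configurations.

```python
import torch
import torch.nn as nn

class ExitBranch(nn.Module):
    """A side-branch classifier: a small conv stack followed by a
    fully-connected layer, attached at an intermediate backbone layer.
    All sizes here are illustrative."""
    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),        # collapse spatial dimensions
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.features(x).flatten(1)
        return self.classifier(z)           # logits for the early exit

class BranchyBackbone(nn.Module):
    """Backbone split into stages, with one early exit after the first
    stage and the usual classifier at the end (a hypothetical two-exit
    layout, not a configuration from the paper)."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.stage1 = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.stage2 = nn.Sequential(
            nn.Conv2d(16, 64, kernel_size=3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1))
        self.exit1 = ExitBranch(16, num_classes)            # early exit branch
        self.final_classifier = nn.Linear(64, num_classes)  # main (last) exit

    def forward(self, x):
        h1 = self.stage1(x)
        logits_exit1 = self.exit1(h1)
        h2 = self.stage2(h1).flatten(1)
        logits_final = self.final_classifier(h2)
        return [logits_exit1, logits_final]  # logits from every exit, in order
```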

Training and Inference

Training BranchyNet involves jointly optimizing a weighted sum of the loss functions from all exit points. The early exits act as regularizers, helping to prevent overfitting and encouraging more discriminative features in the lower layers. At inference time, the entropy of the softmax output at each branch serves as a confidence measure: if it falls below that exit's threshold, the sample returns a prediction and leaves the network early; otherwise it continues to the next exit.
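
A sketch of this joint objective, continuing the hypothetical two-exit model above: the total loss is a weighted sum of the softmax cross-entropy losses at every exit, so gradients from the early branch also reach the shared lower layers. The equal weights and training hyperparameters below are illustrative, not the paper's tuned values.

```python
import torch
import torch.nn as nn

def branchy_loss(exit_logits, targets, weights=(1.0, 1.0)):
    # Weighted sum of cross-entropy losses, one term per exit point.
    ce = nn.CrossEntropyLoss()
    return sum(w * ce(logits, targets) for w, logits in zip(weights, exit_logits))

# One illustrative training step with the two-exit sketch defined earlier.
model = BranchyBackbone(num_classes=10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

images = torch.randn(8, 3, 32, 32)           # dummy CIFAR-10-sized batch
labels = torch.randint(0, 10, (8,))

loss = branchy_loss(model(images), labels)
loss.backward()                               # early-exit gradients also update shared layers
optimizer.step()
```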

Key Contributions

  1. Fast Inference through Early Exits: BranchyNet exits the majority of samples at earlier layers, thus significantly reducing runtime and energy consumption during inference (the entropy-gated exit rule is sketched after this list).
  2. Effective Regularization: The architecture benefits from joint optimization of all exits, improving the network's generalization capabilities.
  3. Mitigation of Vanishing Gradients: Earlier exit points offer more immediate gradient signals during backpropagation, aiding in the training of deeper networks.
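
Below is a simplified sketch of the entropy-gated exit rule referenced in the first contribution, again assuming the hypothetical two-exit model from the architecture section. For clarity it evaluates every exit and gates on the whole batch; the actual savings come from exiting per sample and skipping the layers beyond the chosen branch, and the threshold value here is a placeholder rather than a tuned setting from the paper.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def early_exit_predict(model, x, thresholds=(0.5,)):
    # Compare the softmax entropy at each early exit against its threshold;
    # return that exit's prediction if confident enough, otherwise fall
    # through to the final exit.
    model.eval()
    exit_logits = model(x)                         # logits from every exit, in order
    for logits, tau in zip(exit_logits[:-1], thresholds):
        probs = F.softmax(logits, dim=1)
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
        if entropy.max().item() < tau:             # low entropy = high confidence
            return logits.argmax(dim=1)
    return exit_logits[-1].argmax(dim=1)
```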

Empirical Results

The paper evaluates BranchyNet on several established networks (LeNet, AlexNet, and ResNet) with datasets like MNIST and CIFAR-10. The results demonstrate substantial improvements:

  • B-LeNet: Achieves a 5.4x speedup on CPU and 4.7x on GPU with negligible accuracy loss.
  • B-AlexNet: Offers a 1.5x speedup on CPU and 2.4x on GPU, with a slight improvement in accuracy over the baseline.
  • B-ResNet: Realizes a 1.9x speedup on both CPU and GPU, maintaining competitive accuracy.

Discussion and Implications

The BranchyNet architecture, by allowing samples to exit early, presents a robust solution to the increasing costs associated with deeper networks. This approach is particularly valuable in scenarios where real-time inference and energy efficiency are paramount. Future research could explore adaptive methods for setting entropy thresholds and extend the BranchyNet architecture to other types of neural network tasks beyond classification, such as segmentation and detection.

Conclusion

BranchyNet represents a pragmatic advancement in neural network architecture, aligning the demand for accuracy with the necessity for computational and energy efficiency. Its application to well-known network structures and datasets underscores its versatility and effectiveness. Future work could enhance BranchyNet by integrating it with network compression techniques and exploring automatic threshold tuning methods to further optimize performance across diverse applications.

Future Directions

Possible future developments include:

  • Meta-Recognition Algorithms: To adapt entropy thresholds dynamically based on test sample characteristics.
  • Extended Tasks: Incorporating BranchyNet architectural principles into tasks beyond classification.
  • Further Optimization: Investigating deeper branches and optimal branch point placements to maximize efficiency and accuracy.