- The paper introduces two efficient approximations to standard CNNs: Binary-Weight Networks, which binarize only the weights, and XNOR-Nets, which binarize both weights and input activations, for computational efficiency.
- The paper demonstrates a 32× reduction in memory and a 58× speedup in convolution operations on CPUs, and improves top-1 ImageNet accuracy by more than 16% over previous network-binarization methods.
- The paper provides an efficient training framework that enables high-performance binary CNNs to be deployed on resource-constrained devices.
Overview of "XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks"
The paper "XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks" by Rastegari et al. introduces two efficient approximations to standard convolutional neural networks (CNNs): Binary-Weight Networks (BWN) and XNOR-Networks (XNOR-Net). These approaches aim to significantly reduce the computational cost and memory requirements of CNNs while retaining high accuracy on image classification tasks.
Key Contributions
- Binary-Weight Networks (BWN):
- In BWN, the weights of the CNN are binarized, resulting in approximately 32× memory savings.
- Binarizing the weights allows multiplications to be replaced with simple additions and subtractions, yielding roughly a 2× speedup in convolution operations.
- XNOR-Networks:
- XNOR-Networks extend the concept of binary approximations to both weights and input activations.
- This enables convolution operations to be approximated using XNOR and bit-counting (popcount) operations, yielding roughly a 58× speedup in convolutions on CPUs (measured in the number of high-precision operations replaced) alongside the 32× memory savings; a sketch of the XNOR/popcount dot product follows this list.
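To make the XNOR/bit-counting idea concrete, the sketch below shows how a dot product between two binarized vectors can be computed with XNOR and popcount on packed bit words and then rescaled by the scaling factors. This is a minimal illustrative Python sketch, not the authors' optimized CPU kernel; the packing scheme and the helper names (pack_bits, xnor_popcount_dot) are assumptions for illustration only.

```python
import numpy as np

def pack_bits(x):
    """Binarize a float vector with sign() and pack the resulting
    {0,1} bits into a single Python integer (1 bit per element)."""
    bits = (x >= 0).astype(np.uint8)          # sign(x) mapped to {0, 1}
    word = 0
    for b in bits:
        word = (word << 1) | int(b)
    return word

def xnor_popcount_dot(wx, wy, n):
    """Dot product of two length-n ±1 vectors stored as packed words:
    XNOR the words, count matching bits, map the count back to ±1 terms."""
    mask = (1 << n) - 1
    matches = bin(~(wx ^ wy) & mask).count("1")   # XNOR + popcount
    return 2 * matches - n                        # matches minus mismatches

# Example: approximate a real-valued dot product with the binary one.
rng = np.random.default_rng(0)
w = rng.standard_normal(64)       # full-precision weights
x = rng.standard_normal(64)       # full-precision activations

alpha = np.abs(w).mean()          # scaling factor for the weights
beta = np.abs(x).mean()           # scaling factor for the activations

approx = alpha * beta * xnor_popcount_dot(pack_bits(w), pack_bits(x), w.size)
print(approx, w @ x)              # the binary approximation tracks the true value
```

In an actual XNOR-Net convolution the same idea is applied per filter and per spatial location, with the weight and activation scaling factors precomputed; the rescaling step is what keeps the binary approximation close to the full-precision result.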
Experimental Results
- ImageNet Classification:
- The paper evaluates the performance of BWN and XNOR-Nets on the ImageNet classification task using the AlexNet architecture.
- The Binary-Weight-Network version of AlexNet achieves classification accuracy comparable to that of the full-precision AlexNet.
- XNOR-Nets significantly outperform recent binarization methods such as BinaryConnect and BinaryNet, improving top-1 accuracy by more than 16%.
Implementation and Training
- Binarization Technique:
- The binary weights are obtained by approximating W ≈ αB, where W represents the full-precision weights, B is a binary (±1) filter, and α is a positive scaling factor, chosen to minimize ‖W − αB‖².
- The closed-form solution is B* = sign(W) and α* = ‖W‖ℓ1 / n, i.e., the mean of the absolute weight values; a minimal code sketch of this step follows this list.
- Architecture Design:
- Traditional CNN blocks (convolution → batch normalization → activation → pooling) are reordered to batch normalization → binary activation → binary convolution → pooling, so that pooling operates on convolution outputs rather than on binary values and less information is lost.
- The paper also describes efficient forward propagation, backward propagation, and parameter-update procedures for training these binary networks from scratch, keeping real-valued weights only during training.
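As a concrete illustration of the binarization step described above, the following sketch computes the optimal α and B for one weight filter and shows a straight-through-style gradient pass commonly used when training binary networks. It is a minimal NumPy sketch under assumed names (binarize_weights, straight_through_grad), not the authors' released code, and the clipped straight-through estimator here stands in for the paper's exact gradient derivation.

```python
import numpy as np

def binarize_weights(W):
    """Approximate a real-valued filter W by alpha * B, where
    B = sign(W) and alpha = mean(|W|) (the l1-norm solution)."""
    alpha = np.abs(W).mean()                 # optimal scaling factor
    B = np.where(W >= 0, 1.0, -1.0)          # optimal binary filter
    return alpha, B

def straight_through_grad(grad_out, W, clip=1.0):
    """Pass the gradient through the sign() to the real-valued weights,
    zeroing it where |W| exceeds the clip threshold (a common
    straight-through estimator, used here as an illustrative stand-in)."""
    return grad_out * (np.abs(W) <= clip)

# Training-loop fragment (sketch): binarize, use alpha * B in the forward
# pass, then update the *real-valued* weights with the estimated gradient.
W = np.random.randn(3, 3, 64)                # a hypothetical 3x3x64 filter
alpha, B = binarize_weights(W)
W_bin = alpha * B                            # used in place of W for convolution
grad_W_bin = np.random.randn(*W.shape)       # placeholder upstream gradient
W -= 0.01 * straight_through_grad(grad_W_bin, W)
```

At inference time only α and B need to be stored per filter, which is where the roughly 32× memory saving comes from.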
Implications and Future Directions
The methods proposed in this paper have significant implications for deploying deep learning models on resource-constrained devices such as smartphones and embedded systems. The reduction in memory and computation requirements allows for real-time inference on CPUs without the need for expensive GPUs, making state-of-the-art computer vision techniques more accessible.
The practical applications of this research are far-reaching, potentially impacting fields that rely on portable and low-power devices, including augmented reality, autonomous driving, and various IoT applications.
Theoretical and Practical Considerations
The theoretical contribution of this work lies in demonstrating that CNNs can be effectively binarized without a substantial loss in accuracy. This challenges the traditional reliance on high-precision computations in neural networks and opens up new avenues for efficient deep learning model design.
Practically, the implementation details provided for the BWN and XNOR-Nets offer a roadmap for developing and deploying efficient neural network architectures. The use of scaling factors and optimal binarization strategies ensures that the benefits of binarization are maximized without significant compromises in model performance.
In conclusion, the methods proposed in this paper represent an important step towards more efficient and accessible deep learning models. Future research could explore further optimizations and extensions of this approach, such as multi-bit quantization techniques, to achieve even better trade-offs between accuracy, memory usage, and computational efficiency.