An Evaluation of ShiftAddNet: A Hardware-Inspired Deep Network
The paper introduces ShiftAddNet, a deep neural network (DNN) design tailored for deployment in energy-constrained environments that eliminates multiplication operations entirely. The idea draws on a common hardware design practice: replacing multiplications with bit-shifts and additions, which are far cheaper to implement in silicon. Whereas traditional DNN architectures rely heavily on resource-intensive multiplications, ShiftAddNet leverages inexpensive bit-shift and additive operations to achieve comparable expressive power at a fraction of the resource cost.
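To make the hardware intuition concrete, here is a toy Python sketch (not the paper's implementation) of the core trick: if a weight is quantized to a signed power of two, as in DeepShift-style approaches, scaling an activation by it reduces to a bit-shift plus a sign flip on fixed-point hardware. The function names `quantize_shift` and `shift_mul` are illustrative, not from the paper.

```python
import math

def quantize_shift(w):
    """Approximate a weight by a signed power of two: w ≈ sign * 2**p."""
    if w == 0:
        return 0, 0  # treat zero specially
    sign = 1 if w > 0 else -1
    p = round(math.log2(abs(w)))  # nearest exponent in the log domain
    return sign, p

def shift_mul(x, w):
    """Scale x by the shift-quantized weight without a general multiply:
    on fixed-point hardware, scaling by 2**p is a single bit-shift."""
    sign, p = quantize_shift(w)
    return sign * math.ldexp(x, p)  # ldexp(x, p) == x * 2**p

print(shift_mul(3.0, 4.0))  # 4.0 is exactly 2**2, so 3 << 2 -> 12.0
print(shift_mul(8.0, 0.3))  # 0.3 rounds to 2**-2, so 8 >> 2 -> 2.0
```

The quantization step loses precision for weights far from a power of two (0.3 becomes 0.25 above), which is exactly why pairing shifts with a more flexible additive layer matters.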
Core Characteristics of ShiftAddNet
This design reduces hardware energy cost by up to 80% relative to multiplication-based DNN baselines while maintaining or improving recognition accuracy. By interleaving bit-shift and additive layers, ShiftAddNet provides a versatile architecture that can explicitly control the model's learning capacity, and it proves notably robust to quantization, a common weakness of other multiplication-less models.
The key innovation is an explicit re-parameterization of standard neural network layers that eliminates multiplication in favor of shift and add operations. The result is a deep learning framework that is fundamentally more hardware-efficient, drastically reducing both training and inference energy costs without sacrificing the expressive power needed for complex learning tasks. Evaluated against existing multiplication-less models such as AdderNet and DeepShift, ShiftAddNet consistently achieves better energy efficiency and accuracy than either.
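The layer pairing described above can be sketched in NumPy. This is a rough illustration under stated assumptions, not the authors' code: the shift layer uses power-of-two weights (so each conceptual "multiply" is a bit-shift on fixed-point hardware), and the additive layer follows AdderNet's formulation of computing a negative L1 distance between inputs and weight vectors, using only additions and subtractions. All shapes and names are illustrative.

```python
import numpy as np

def shift_layer(x, signs, exps):
    """Shift layer: every weight is a signed power of two, so each
    'multiply' reduces to a bit-shift plus a sign flip in hardware.
    x: (batch, d_in); signs, exps: (d_out, d_in)."""
    w = signs * np.exp2(exps)  # materialize the power-of-two weights
    return x @ w.T             # stands in for shift-and-accumulate

def add_layer(x, w):
    """AdderNet-style additive layer: negative L1 distance between the
    input and each weight vector, i.e. additions/subtractions only.
    x: (batch, d_in); w: (d_out, d_in)."""
    return -np.abs(x[:, None, :] - w[None, :, :]).sum(axis=-1)

# A 'shift-add' block chains the two, mirroring ShiftAddNet's layer pairing.
x = np.array([[1.0, 2.0]])
h = shift_layer(x,
                signs=np.array([[1, -1], [1, 1]]),
                exps=np.array([[1, 0], [0, 0]]))  # weights [[2,-1],[1,1]]
y = add_layer(h, w=np.zeros((3, 2)))
```

Chaining a coarse shift layer with a finer-grained additive layer is what lets the combined block recover expressiveness that either operation alone would lose.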
Practical and Theoretical Implications
ShiftAddNet has significant implications for deploying deep learning models in real-world applications, especially in edge computing scenarios where computational resources are limited. These low-power devices urgently require models that can deliver high performance without incurring prohibitive energy costs. By adopting a hardware-inspired design philosophy, ShiftAddNet opens new avenues in the development of edge-intelligent systems, pushing the boundaries of on-device learning efficiency.
Theoretically, ShiftAddNet establishes a benchmark for understanding the expressive capacity of multiplication-free networks. The authors observe that, despite the absence of multiplications, combining shift and add operations yields a rich yet efficient space of learned representations comparable to those of traditional DNNs. The paper thus lays groundwork for designing algorithms with explicit hardware considerations, bridging the gap between theoretical machine learning advances and practical hardware implementations.
Future research directions might focus on optimizing the balance between shift and add layers to maximize model accuracy while minimizing hardware costs. Investigations could also explore the scalability of ShiftAddNet's approach across varied domains and model architectures, refining the method to ensure consistent performance gains in diverse applications.
Conclusion
ShiftAddNet represents a critical step toward overcoming the energy constraints inherent in deploying DNNs on edge devices. By repurposing hardware-centric techniques for deep learning architectures, it provides an avenue for creating models that fulfill energy and computational prerequisites while retaining high performance. The insights gleaned from this paper highlight the potential of redesigning network architectures with practical deployment considerations, marking an evolution in the dynamic field of AI and machine learning model development.