VLSI Implementation of Deep Neural Network Using Integral Stochastic Computing (1509.08972v2)

Published 29 Sep 2015 in cs.NE and cs.AR

Abstract: The hardware implementation of deep neural networks (DNNs) has recently received tremendous attention: many applications in fact require high-speed operations that suit a hardware implementation. However, numerous elements and complex interconnections are usually required, leading to a large area occupation and copious power consumption. Stochastic computing has shown promising results for low-power area-efficient hardware implementations, even though existing stochastic algorithms require long streams that cause long latencies. In this paper, we propose an integer form of stochastic computation and introduce some elementary circuits. We then propose an efficient implementation of a DNN based on integral stochastic computing. The proposed architecture has been implemented on a Virtex7 FPGA, resulting in 45% and 62% average reductions in area and latency compared to the best reported architecture in the literature. We also synthesize the circuits in a 65 nm CMOS technology and we show that the proposed integral stochastic architecture results in up to 21% reduction in energy consumption compared to the binary radix implementation at the same misclassification rate. Due to the fault-tolerant nature of stochastic architectures, we also consider a quasi-synchronous implementation which yields 33% reduction in energy consumption w.r.t. the binary radix implementation without any compromise on performance.

Citations (168)

Summary

  • The paper introduces integral stochastic computing to reduce VLSI area by 45% and latency by 62% compared to existing architectures.
  • It achieves up to 21% lower energy consumption with 65 nm CMOS synthesis while maintaining similar misclassification rates.
  • Leveraging fault-tolerant designs, its quasi-synchronous implementation reduces energy by 33% and paves the way for future nanoscale technologies.

An Overview of VLSI Implementation of Deep Neural Networks Using Integral Stochastic Computing

The paper, titled "VLSI Implementation of Deep Neural Network Using Integral Stochastic Computing," presents an approach to improving the hardware implementation of deep neural networks (DNNs) by employing integral stochastic computing. The method matters because real-world applications, such as machine learning and Internet of Things (IoT) systems, increasingly require hardware that is both fast and power-efficient.

Key Contributions and Findings

The authors tackle the challenges associated with conventional stochastic computing, notably its dependency on long bit-streams, which result in increased latency and power consumption. To address this, they propose an integer form of stochastic computation, introducing foundational circuits that incorporate this approach. The paper highlights the following major contributions:

  1. Area and Latency Reduction: Implementing their architecture on a Virtex7 FPGA, the authors demonstrate an average reduction of 45% in area and 62% in latency compared to the most efficient known architecture. Their results indicate significant improvements in the integration of DNNs into compact and power-efficient hardware.
  2. Energy Efficiency: Through synthesis in a 65 nm CMOS technology, the proposed integral stochastic architecture exhibits up to 21% reduction in energy consumption when compared to traditional binary radix implementations, while maintaining similar misclassification rates.
  3. Fault-tolerance: The inherent fault-tolerant nature of stochastic architectures is leveraged to explore quasi-synchronous implementations, which yield up to 33% reductions in energy consumption without a decrease in performance. This result is particularly promising for future unreliable process technologies, such as nanoscale memristor devices.
  4. Integral Stochastic Computation: The paper extends stochastic computing to integer-valued streams, enhancing the precision and range of stochastic computation. This adjustment lets values be represented and combined without cumbersome additional binary-to-stochastic conversions, thereby improving processing efficiency (a toy encoding sketch follows this list).
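
To make the encoding concrete, below is a minimal, illustrative Python sketch (not taken from the paper) of how an integral stochastic stream can be formed by summing several binary stochastic streams, so that sums of values, including results exceeding 1, remain representable. The function names and parameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def to_integral_stream(x, m, length):
    """Encode x in [0, 1] as an integral stochastic stream.

    Each element is the sum of m independent Bernoulli(x) bits, so the
    stream takes integer values in {0, ..., m} and its time-average
    converges to m * x.
    """
    bits = rng.random((m, length)) < x        # m parallel binary streams
    return bits.sum(axis=0).astype(int)       # integer-valued stream

def decode(stream, m):
    """Recover the encoded value: time-average of the stream divided by m."""
    return stream.mean() / m

# Addition with a plain integer adder: the result encodes a + b directly,
# even though a + b > 1, with no down-scaling step.
a, b = 0.6, 0.7
m, length = 2, 100_000
sa = to_integral_stream(a, m, length)
sb = to_integral_stream(b, m, length)
print(round(decode(sa, m), 3), round(decode(sb, m), 3))   # ~0.6, ~0.7
print(round(decode(sa + sb, m), 3))                       # ~1.3
```

For comparison, a multiplexer-based adder in conventional single-bit stochastic computing produces a stream encoding the scaled value (a + b)/2; carrying integer values avoids that scaling, which is the kind of benefit the integral representation targets.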

Implications and Future Directions

This research provides a robust framework for the design and implementation of DNNs within highly constrained hardware environments, which is critical for advancing applications in IoT and edge computing. The proposed integral stochastic computing method aligns well with future developments in semiconductor technologies that emphasize energy efficiency and fault tolerance.

Looking ahead, the exploration of further miniaturization and integration of this approach into emerging technologies, such as quantum computing and neuromorphic circuits, could open new avenues for research. Additionally, refining the process of generating stochastic bit-streams and optimizing the architectural design for broader classes of neural networks could further enhance the applicability and efficiency of integral stochastic computing.

Conclusion

The innovative methodologies and performance advancements described in this paper mark a significant step forward in the practical realization of hardware-efficient DNNs. The integration of integral stochastic computing into VLSI technology promises to support the evolution of compact and powerful neural network systems, expanding their applicability across a broader spectrum of technological applications.