- The paper demonstrates that SC-DCNN significantly reduces hardware footprint and energy consumption while maintaining high network accuracy.
- It details the design and joint optimization of core function blocks, including inner product, pooling, and activation functions.
- Experimental evaluations show a throughput of 781,250 images per second and an energy efficiency of 510,734 images per joule, with less than 1.5% accuracy degradation.
SC-DCNN: A Framework for SC-based Deep Convolutional Neural Networks
The paper "SC-DCNN: Highly-Scalable Deep Convolutional Neural Network using Stochastic Computing" provides a meticulous exploration into the application of Stochastic Computing (SC) as a promising approach to implement Deep Convolutional Neural Networks (DCNNs) with reduced hardware resources and energy consumption. The authors present a comprehensive design and optimization strategy aimed at deploying DCNNs on embedded and mobile IoT devices.
SC represents a numerical value as the probability of a 1 appearing in a bit-stream, which allows operations such as multiplication and addition to be performed with minimal hardware using simple logic gates like AND gates and multiplexers. The authors introduce SC-DCNN, a framework that systematically integrates SC with DCNNs to achieve substantial reductions in hardware footprint and energy consumption, and to improve scalability, while maintaining comparable network accuracy.
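To make that arithmetic concrete, here is a minimal Python simulation of unipolar SC (not code from the paper; the helper names and stream length are illustrative): a bitwise AND multiplies two streams, and a multiplexer performs scaled addition.

```python
import numpy as np

rng = np.random.default_rng(0)

def to_stream(p, length=8192):
    """Unipolar SC encoding: each bit is 1 with probability p, for p in [0, 1]."""
    return rng.random(length) < p

def from_stream(s):
    """Decoding: the value is simply the fraction of 1s in the stream."""
    return s.mean()

a = to_stream(0.8)
b = to_stream(0.5)

# Multiplication of independent unipolar streams is a bitwise AND, since
# P(a_i AND b_i = 1) = P(a_i = 1) * P(b_i = 1).
print(from_stream(a & b))                  # ~0.40

# Scaled addition uses a 2-to-1 multiplexer with a 0.5-probability select
# stream: the output encodes (p_a + p_b) / 2.
sel = to_stream(0.5)
print(from_stream(np.where(sel, a, b)))    # ~0.65
```

Longer streams shrink the stochastic estimation error, which is the core latency/accuracy trade-off in SC designs.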
The SC-DCNN framework is structured in a bottom-up manner. The paper first describes the design and optimization of the basic function blocks that every DCNN architecture requires: inner-product computation, pooling, and activation functions. For each block, the framework proposes novel designs that are jointly optimized to reduce power and hardware cost while maintaining performance.
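As an illustration of how an inner-product block can behave, the following sketch simulates bipolar SC (values in [-1, 1]), where XNOR gates implement multiplication and an n-to-1 multiplexer implements addition scaled by 1/n. This is a software model under those standard SC assumptions, not the paper's hardware design.

```python
import numpy as np

rng = np.random.default_rng(1)
L = 1 << 16   # stream length; longer streams reduce stochastic error

def bipolar_stream(x):
    """Bipolar SC encoding: x in [-1, 1] maps to bit probability (x + 1) / 2."""
    return rng.random(L) < (x + 1) / 2

def bipolar_value(s):
    return 2 * s.mean() - 1

def sc_inner_product(xs, ws):
    """Sketch of a MUX-based SC inner product: XNOR performs bipolar
    multiplication; an n-to-1 multiplexer with a uniform random select
    performs addition scaled by 1/n."""
    products = [~(bipolar_stream(x) ^ bipolar_stream(w)) for x, w in zip(xs, ws)]
    sel = rng.integers(0, len(products), L)        # uniform select stream
    mux_out = np.stack(products)[sel, np.arange(L)]
    return bipolar_value(mux_out) * len(products)  # undo the 1/n scaling

# 0.5*0.5 + (-0.25)*0.5 + 0.75*(-0.5) = -0.25
print(sc_inner_product([0.5, -0.25, 0.75], [0.5, 0.5, -0.5]))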
One significant feature of the framework is the introduction of four distinct feature extraction block designs. These are optimized combinations of the basic function blocks, adapted to different network configurations to sustain high accuracy. The feature extraction blocks, which form the network's backbone by connecting the individual function blocks, are analyzed and optimized to balance trade-offs among hardware cost, latency, power, and accuracy.
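One example of how the blocks compose cheaply in SC: a multiplexer used for scaled addition doubles as an average-pooling unit, since selecting uniformly among four input streams yields a stream that directly encodes their mean. A small sketch under the same bipolar-encoding assumptions as above:

```python
import numpy as np

rng = np.random.default_rng(2)
L = 1 << 14

def bipolar_stream(x):
    return rng.random(L) < (x + 1) / 2

def sc_average_pool(streams):
    """A 4-to-1 MUX with a uniform select stream passes each input bit with
    probability 1/4, so the output stream encodes the mean of the inputs:
    average pooling with no adder tree at all."""
    sel = rng.integers(0, len(streams), L)
    return np.stack(streams)[sel, np.arange(L)]

pooled = sc_average_pool([bipolar_stream(x) for x in (0.8, 0.4, -0.2, 0.2)])
print(2 * pooled.mean() - 1)   # ~0.30, the mean of the four inputs
```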
Weight storage is also addressed, with a focus on efficient memory utilization. The authors propose several schemes for storing weights in SRAM that reduce area and power consumption without compromising network accuracy: efficient filter-aware SRAM sharing, reduced-precision weight storage, and layer-wise weight precision all contribute to a dramatic reduction in resource usage.
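As a rough illustration of the layer-wise precision idea (the helper below is hypothetical and much simpler than the paper's filter-aware SRAM sharing scheme), weights can be rounded onto a narrower fixed-point grid per layer, trading a small quantization error for smaller storage:

```python
import numpy as np

def quantize_weights(w, bits):
    """Round weights onto a signed fixed-point grid in [-1, 1). Fewer bits
    per weight mean smaller SRAM blocks, at the cost of quantization error."""
    scale = 2 ** (bits - 1)
    return np.clip(np.round(w * scale), -scale, scale - 1) / scale

w = np.array([0.731, -0.052, 0.318])
for bits in (8, 7, 6):   # e.g. different precisions for different layers
    q = quantize_weights(w, bits)
    print(bits, q, float(np.abs(w - q).max()))
```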
Experimental evaluations substantiate the framework's efficacy by deploying LeNet-5, a benchmark DCNN, with SC-DCNN. The evaluations show that SC-DCNN achieves a throughput of 781,250 images per second and an energy efficiency of 510,734 images per joule, striking improvements over existing general-purpose and application-specific architectures. The experiments also demonstrate that SC-DCNN delivers these gains while keeping accuracy degradation below 1.5% relative to traditional software implementations.
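A quick back-of-the-envelope check using only the two reported figures: since images per joule equals images per second divided by watts, the numbers together imply an average power draw of roughly 1.5 W.

```python
throughput = 781_250   # images per second (reported)
efficiency = 510_734   # images per joule  (reported)

# images/joule = (images/second) / watts, so the two figures imply:
print(f"{throughput / efficiency:.2f} W")   # ~1.53 W
```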
In conclusion, this paper lays a foundation for employing stochastic computing in AI, particularly for edge devices that require efficient DCNN implementations, and shows that SC-DCNN is especially well suited to scenarios with stringent resource constraints. Future developments in stochastic computing and its integration with machine learning models could lead to significant advances in AI, especially for IoT applications where power efficiency and hardware size are crucial considerations. The SC-DCNN framework offers a compelling alternative to existing DCNN implementations and encourages further research into efficient neural network design.