- The paper presents a training method that injects stochastic noise into DNN weights, making the network robust to the inherent inaccuracies of phase-change memory (PCM).
- It demonstrates near-software-level performance with 93.7% accuracy on CIFAR-10 and 71.6% top-1 accuracy on ImageNet using PCM chips.
- The approach leverages drift compensation techniques to maintain high inference accuracy over time under varying environmental conditions.
Accurate Deep Neural Network Inference Using Computational Phase-Change Memory
The paper presents an approach to improving the inference accuracy of deep neural networks (DNNs) deployed on analog in-memory computing hardware, with a focus on phase-change memory (PCM). The researchers introduce a training methodology that makes DNNs tolerant of the inaccuracies incurred when weights are transferred to PCM synapses, so that little accuracy is lost in deployment. The method is demonstrated on ResNet-type models for the CIFAR-10 and ImageNet datasets.
The core idea is to train ResNet-type convolutional neural networks with stochastic noise injected into the synaptic weights, so that the networks become inherently robust to the noise and variability typical of PCM devices. Combined with a compensation scheme that leverages batch-normalization parameters, this training strategy largely preserves inference accuracy on hardware: the authors report 93.7% accuracy on CIFAR-10 and 71.6% top-1 accuracy on ImageNet after mapping the weights to PCM, surpassing previous demonstrations in maintaining near-software-level performance over an extended period.
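To make the noise-injection idea concrete, here is a minimal sketch in PyTorch. The layer class `NoisyLinear`, the relative noise scale `eta`, and the additive-Gaussian noise model are illustrative assumptions; the paper's exact noise model and injection schedule may differ.

```python
import torch
import torch.nn as nn

class NoisyLinear(nn.Linear):
    """Linear layer that perturbs its weights with additive Gaussian noise
    during training, emulating PCM programming/read variability."""

    def __init__(self, in_features, out_features, eta=0.03):
        super().__init__(in_features, out_features)
        self.eta = eta  # noise std as a fraction of the largest |weight|

    def forward(self, x):
        if self.training:
            w_max = self.weight.detach().abs().max()
            noise = torch.randn_like(self.weight) * self.eta * w_max
            # Perturb only the forward computation; gradients still update
            # the clean (noise-free) weights.
            return nn.functional.linear(x, self.weight + noise, self.bias)
        return super().forward(x)
```

Drawing fresh noise each forward pass while keeping the stored weights clean is one common way to emulate hardware non-idealities during training; it forces the network to find solutions that remain accurate under weight perturbation.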
The experiments use a prototype multi-level PCM chip to empirically validate the proposed training and deployment methodologies. With the synaptic weights of ResNet-32 programmed onto 723,444 PCM devices, accuracy degradation from conductance drift is kept small over a prolonged duration: accuracy remains above 92.6% over one day, even under varying environmental conditions, thanks to drift-correction techniques such as Global Drift Compensation (GDC) and Adaptive Batch Normalization Statistics update (AdaBS).
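The two compensation techniques can be sketched roughly as follows, again in PyTorch. The function names `gdc_factor` and `adabs_update`, the summed-absolute-output statistic used for GDC, and the calibration-loader interface are assumptions for illustration, not the chip's actual calibration procedure.

```python
import torch

@torch.no_grad()
def gdc_factor(analog_mvm, calibration_inputs, reference_sum):
    """Global drift compensation (sketch): compute one scalar per layer that
    rescales drifted analog outputs toward the values recorded right after
    programming. `reference_sum` is the summed |output| for the same
    calibration inputs, measured at programming time."""
    current_sum = analog_mvm(calibration_inputs).abs().sum()
    return reference_sum / current_sum.clamp_min(1e-12)

@torch.no_grad()
def adabs_update(model, calibration_loader):
    """AdaBS (sketch): refresh batch-norm running statistics using activations
    produced by the drifted hardware, so BN re-centres and re-scales the
    shifted activation distributions."""
    model.train()                      # BN updates running stats in train mode
    for inputs, _ in calibration_loader:
        model(inputs)                  # forward passes only, no weight updates
    model.eval()
```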
The implications of this paper are twofold. Practically, it outlines a viable path toward energy-efficient, high-fidelity DNN inference engines that sharply reduce the data-movement overhead and latency of traditional von Neumann architectures. Theoretically, the work reinforces the value of adaptive training methodologies and of co-designing machine learning with novel hardware, paving the way for more advanced and scalable analog in-memory computing systems.
Future work could integrate multi-device encoding per synaptic weight to further boost accuracy, albeit at the cost of additional devices and energy (see the toy sketch below). Exploring more complex DNN architectures and additional PCM drift-compensation methods could improve retained accuracy further, potentially broadening applications to resource-constrained, edge-based AI deployments. The adaptability of the proposed training method across different resistive memory types further suggests promising prospects for widespread adoption.
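As a back-of-the-envelope illustration of why multi-device encoding helps: averaging independent noisy readouts of the same weight over N devices reduces the effective read noise by roughly 1/sqrt(N). The snippet below is a toy model with an assumed Gaussian read-noise standard deviation `sigma`, not a description of the chip's actual encoding scheme.

```python
import torch

def multi_device_readout(weights, n_devices=4, sigma=0.05):
    """Toy model: each weight is stored on `n_devices` PCM devices whose noisy
    readouts are averaged; the residual error shrinks roughly as 1/sqrt(n)."""
    readouts = weights.unsqueeze(0) + sigma * torch.randn(n_devices, *weights.shape)
    return readouts.mean(dim=0)
```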
In summary, this research marks a significant advancement towards realizing efficient deep learning inference using computational memory, overcoming challenges that have long deterred the practical adoption of resistive memories in AI hardware solutions.