Accurate deep neural network inference using computational phase-change memory (1906.03138v2)

Published 7 Jun 2019 in cs.ET

Abstract: In-memory computing is a promising non-von Neumann approach for making energy-efficient deep learning inference hardware. Crossbar arrays of resistive memory devices can be used to encode the network weights and perform efficient analog matrix-vector multiplications without intermediate movements of data. However, due to device variability and noise, the network needs to be trained in a specific way so that transferring the digitally trained weights to the analog resistive memory devices will not result in significant loss of accuracy. Here, we introduce a methodology to train ResNet-type convolutional neural networks that results in no appreciable accuracy loss when transferring weights to in-memory computing hardware based on phase-change memory (PCM). We also propose a compensation technique that exploits the batch normalization parameters to improve the accuracy retention over time. We achieve a classification accuracy of 93.7% on the CIFAR-10 dataset and a top-1 accuracy on the ImageNet benchmark of 71.6% after mapping the trained weights to PCM. Our hardware results on CIFAR-10 with ResNet-32 demonstrate an accuracy above 93.5% retained over a one day period, where each of the 361,722 synaptic weights of the network is programmed on just two PCM devices organized in a differential configuration.

Authors (10)
  1. Vinay Joshi (8 papers)
  2. Manuel Le Gallo (33 papers)
  3. Simon Haefeli (1 paper)
  4. Irem Boybat (22 papers)
  5. S. R. Nandakumar (7 papers)
  6. Christophe Piveteau (13 papers)
  7. Martino Dazzi (9 papers)
  8. Bipin Rajendran (50 papers)
  9. Abu Sebastian (67 papers)
  10. Evangelos Eleftheriou (23 papers)
Citations (322)

Summary

  • The paper presents an optimized training method that injects stochastic noise into DNN weights to make the network robust to the inherent inaccuracies of phase-change memory.
  • It demonstrates near-software-level accuracy of 93.7% on CIFAR-10 and 71.6% top-1 on ImageNet after mapping the trained weights to PCM.
  • The approach leverages drift compensation techniques to maintain high inference accuracy over time under varying environmental conditions.

Accurate Deep Neural Network Inference Using Computational Phase-Change Memory

The paper presents an approach to improving the inference accuracy of deep neural networks (DNNs) deployed on analog in-memory computing hardware, with a particular focus on phase-change memory (PCM). The researchers introduce a training methodology that mitigates the inaccuracies incurred when digitally trained weights are transferred to PCM synapses, so that the transfer causes no appreciable loss of accuracy, and they demonstrate it on ResNet-type models for the CIFAR-10 and ImageNet datasets.
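
To make the notion of transferring weights to PCM synapses concrete, the following is a minimal sketch of a differential weight-to-conductance mapping, in which each weight is represented by the difference of two device conductances as described in the abstract. The function names, the per-tensor scaling by the maximum absolute weight, and the `g_max` value are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

def map_weights_to_differential_pcm(weights: np.ndarray, g_max: float = 25.0):
    """Map trained weights onto pairs of PCM conductances (G+, G-) in a
    differential configuration, so each weight is proportional to G+ - G-.
    `g_max` (maximum programmable conductance, here in microsiemens) and the
    per-tensor scaling are illustrative assumptions."""
    w_max = np.abs(weights).max()
    g_target = weights / w_max * g_max        # signed conductance target
    g_plus = np.clip(g_target, 0.0, g_max)    # positive weights on one device
    g_minus = np.clip(-g_target, 0.0, g_max)  # negative weights on the other
    return g_plus, g_minus, w_max

def effective_weights(g_plus, g_minus, w_max, g_max: float = 25.0):
    """Weights as seen by the analog matrix-vector multiply after programming."""
    return (g_plus - g_minus) / g_max * w_max
```

In practice, programming the two conductances on the chip is an iterative, noisy process; the point of the sketch is only the differential encoding W ∝ G+ − G−.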

The core of the method is to train ResNet-type convolutional neural networks with stochastic noise injected into the synaptic weights during the forward pass, so that the network becomes inherently robust to the noise and variability typical of PCM devices. Combined with a compensation technique that exploits the batch normalization parameters, this training strategy largely retains DNN inference accuracy on hardware: the authors report 93.7% accuracy on CIFAR-10 and a top-1 accuracy of 71.6% on ImageNet after mapping the trained weights to PCM, surpassing previous demonstrations in maintaining near-software-level performance over an extended period.
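
As a rough illustration of weight-noise injection during training, here is a minimal PyTorch-style sketch. The additive Gaussian noise scaled by each layer's maximum absolute weight, the amplitude `eta`, and the function name are illustrative assumptions; the paper's exact noise model and schedule may differ.

```python
import torch
import torch.nn as nn

def train_step_with_weight_noise(model, x, y, loss_fn, optimizer, eta=0.04):
    """One training step in which Gaussian noise is added to the weights of conv
    and linear layers for the forward/backward pass, and the optimizer update is
    applied to the restored, noise-free weights."""
    noisy_layers = [m for m in model.modules()
                    if isinstance(m, (nn.Conv2d, nn.Linear))]
    clean_weights = [m.weight.detach().clone() for m in noisy_layers]

    with torch.no_grad():  # perturb weights in place for this pass only
        for m in noisy_layers:
            w_max = m.weight.detach().abs().max()
            m.weight.add_(torch.randn_like(m.weight) * eta * w_max)

    optimizer.zero_grad()
    loss = loss_fn(model(x), y)   # forward pass with perturbed weights
    loss.backward()               # gradients w.r.t. the perturbed weights

    with torch.no_grad():  # restore clean weights before the update
        for m, w in zip(noisy_layers, clean_weights):
            m.weight.copy_(w)
    optimizer.step()
    return loss.item()
```

Because the gradient is computed with perturbed weights but applied to the clean ones, the network learns parameter settings whose accuracy is insensitive to the kind of perturbations PCM devices introduce.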

The experiments use a prototype multi-level PCM chip to empirically validate the proposed training and deployment methodology. For ResNet-32, each of the 361,722 synaptic weights is programmed onto two PCM devices in a differential configuration, for a total of 723,444 devices. With the applied drift-compensation techniques, Global Drift Compensation (GDC) and Adaptive Batch Normalization Statistics update (AdaBS), the hardware classification accuracy on CIFAR-10 remains above 93.5% over a one-day period despite conductance drift and varying environmental conditions.
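
To convey the flavor of these two corrections, here is a minimal sketch. Both helper functions are hypothetical illustrations, not the chip's actual control code: GDC rescales a layer's analog output by the ratio of a calibration response recorded at programming time to the current one, while AdaBS re-estimates batch-normalization running statistics on a small calibration set so the digital part of the network tracks the drifted analog outputs.

```python
import torch

def global_drift_compensation(layer_output: torch.Tensor,
                              calib_response_at_programming: float,
                              calib_response_now: float) -> torch.Tensor:
    """Rescale an analog layer's output by a single correction factor obtained
    from a calibration input (hypothetical helper; the real correction is
    applied per crossbar in the digital periphery)."""
    alpha = calib_response_at_programming / calib_response_now
    return layer_output * alpha

def adaptive_bn_statistics_update(model: torch.nn.Module, calibration_loader,
                                  device: str = "cpu") -> torch.nn.Module:
    """Refresh BatchNorm running mean/variance using a small calibration set
    (an AdaBS-style sketch; assumes the loader yields (image, label) batches)."""
    model.train()          # BatchNorm updates running statistics in train mode
    with torch.no_grad():  # no weight updates, only statistics
        for images, _ in calibration_loader:
            model(images.to(device))
    model.eval()
    return model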

The implications of this work are twofold. Practically, it outlines a viable path toward energy-efficient, high-fidelity DNN inference engines that avoid much of the data-movement overhead and latency of traditional von Neumann architectures. Theoretically, it underscores the value of hardware-aware training methodologies and the co-design of machine learning algorithms with novel hardware, paving the way for more advanced and scalable analog in-memory computing systems.

Future work could encode each synaptic weight on more than two devices to further boost accuracy, albeit at the cost of additional devices and area. Exploring more complex DNN architectures and additional PCM drift-compensation methods could yield further gains, potentially broadening applications to more resource-constrained and edge-based AI deployments. The adaptability of the proposed training method to different resistive memory types further suggests promising prospects for widespread adoption.

In summary, this research marks a significant advancement towards realizing efficient deep learning inference using computational memory, overcoming challenges that have long deterred the practical adoption of resistive memories in AI hardware solutions.