Compute-in-Memory-Based Neural Network Accelerators for Safety-Critical Systems: Worst-Case Scenarios and Protections (2312.06137v1)
Abstract: Emerging non-volatile memory (NVM)-based Computing-in-Memory (CiM) architectures show substantial promise in accelerating deep neural networks (DNNs) due to their exceptional energy efficiency. However, NVM devices are prone to device variations: the actual DNN weights mapped to NVM devices can differ considerably from their target values, inducing significant performance degradation. Many existing solutions optimize average performance under device variations, a suitable strategy for general-purpose settings, but the worst-case performance, which is crucial for safety-critical applications, is largely overlooked in current research. In this study, we define the problem of pinpointing the worst-case performance of CiM DNN accelerators affected by device variations. We also introduce a strategy to identify, within the complex, high-dimensional deviation space, the specific pattern of device value deviations responsible for this worst-case outcome. Our findings reveal that even subtle device variations can precipitate a dramatic decline in DNN accuracy, posing risks to CiM-based platforms in safety-critical applications. Notably, prevailing techniques that bolster average DNN performance on CiM accelerators fall short of improving worst-case performance. To address this issue, we propose a novel worst-case-aware training technique named A-TRICE that efficiently combines adversarial training and noise-injection training with right-censored Gaussian noise to improve DNN accuracy in the worst-case scenarios. Our experiments show that A-TRICE improves worst-case accuracy under device variations by up to 33%.
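The worst-case search the abstract refers to can be pictured as an adversarial attack in weight space rather than input space: maximize the loss over a bounded set of weight deviations. The sketch below is a minimal PGD-style illustration of that idea, assuming an L-infinity bound `max_dev` on per-weight deviations as a stand-in for the device-variation range; the function name, hyperparameters, and the use of `torch.func.functional_call` are illustrative choices, not the paper's actual algorithm.

```python
import torch

def worst_case_weight_deviation(model, loss_fn, x, y,
                                max_dev=0.03, steps=20, lr=0.01):
    """Gradient-ascent search for a weight-deviation pattern that
    maximizes the loss on a representative batch (x, y).

    Hypothetical sketch, not the paper's method: deviations are kept
    inside an L-inf ball of radius `max_dev` around nominal weights.
    """
    names = [n for n, _ in model.named_parameters()]
    params = {n: p.detach() for n, p in model.named_parameters()}
    deltas = {n: torch.zeros_like(p, requires_grad=True)
              for n, p in params.items()}

    for _ in range(steps):
        # Run the model with perturbed weights w + delta.
        out = torch.func.functional_call(
            model, {n: params[n] + deltas[n] for n in names}, (x,))
        loss = loss_fn(out, y)
        grads = torch.autograd.grad(loss, list(deltas.values()))
        with torch.no_grad():
            for n, g in zip(names, grads):
                deltas[n] += lr * g.sign()           # ascend the loss
                deltas[n].clamp_(-max_dev, max_dev)  # project into bound
    return deltas  # deviation pattern found by the search
```

In the paper's setting, the feasible deviation set would come from a device-variation model rather than a uniform L-infinity ball, but the loop structure (maximize loss over a constrained weight deviation) is the core idea.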
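A-TRICE's noise-injection component trains with right-censored Gaussian noise, i.e., Gaussian samples whose values above a censor point are clipped to it, so the injected perturbation is skewed toward one side. Below is a minimal sketch, assuming a censor point of 0, a noise scale `sigma`, and a plain PyTorch linear layer; these names and values are illustrative, and the adversarial-training half of A-TRICE is omitted here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def right_censored_gaussian(like, sigma, censor=0.0):
    """Gaussian noise with values above `censor` clipped to it.

    Censoring at 0 (an assumption in this sketch) skews the noise
    negative, mimicking deviations that push weights one way.
    """
    return (torch.randn_like(like) * sigma).clamp_(max=censor)

class NoisyLinear(nn.Linear):
    """Linear layer that perturbs its weights with right-censored
    Gaussian noise on each training-mode forward pass, so the
    network learns to tolerate such deviations."""

    def __init__(self, in_features, out_features, sigma=0.05, bias=True):
        super().__init__(in_features, out_features, bias=bias)
        self.sigma = sigma

    def forward(self, x):
        if self.training:
            noise = right_censored_gaussian(self.weight, self.sigma)
            return F.linear(x, self.weight + noise, self.bias)
        return super().forward(x)  # clean weights at inference time
```

During training, every forward pass sees a fresh censored-noise realization while evaluation uses the clean weights, matching the usual noise-injection-training recipe; in A-TRICE this would be combined with adversarial weight perturbations such as the search sketched above.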