Concurrent Self-testing of Neural Networks Using Uncertainty Fingerprint (2401.01458v1)
Abstract: Neural networks (NNs) are increasingly used in always-on safety-critical applications deployed on hardware accelerators (NN-HAs) employing various memory technologies. Reliable continuous operation of NNs is essential for safety-critical applications. During online operation, NNs are susceptible to single and multiple permanent and soft errors caused by factors such as radiation, aging, and thermal effects. Explicit NN-HA testing methods cannot detect transient faults during inference, are unsuitable for always-on applications, and require extensive test vector generation and storage. Therefore, in this paper, we propose an \emph{uncertainty fingerprint} approach that represents the online fault status of an NN. Furthermore, we propose a dual-head NN topology specifically designed to produce the uncertainty fingerprint and the primary prediction of the NN in \emph{a single shot}. During online operation, by matching the uncertainty fingerprint, we can concurrently self-test NNs with up to $100\%$ fault coverage at a low false-positive rate while maintaining similar performance on the primary task. Compared to existing works, memory overhead is reduced by up to $243.7$ MB, multiply-and-accumulate (MAC) operations are reduced by up to $10000\times$, and the false-positive rate is reduced by up to $89\%$.
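The abstract does not give implementation details, so the following is only a minimal sketch of the dual-head idea under stated assumptions: a fully connected backbone, a hypothetical fingerprint dimension `fp_dim`, a hypothetical `DualHeadNet` class, and an L1 distance threshold `tol` for fingerprint matching, none of which come from the paper. It illustrates how a single forward pass could yield both the primary prediction and a fingerprint to compare against a fault-free reference.

```python
# Hedged sketch, NOT the paper's implementation: a dual-head classifier that
# returns the primary prediction and an "uncertainty fingerprint" in one
# forward pass, plus a hypothetical threshold check for concurrent self-test.
import torch
import torch.nn as nn

class DualHeadNet(nn.Module):
    def __init__(self, in_features=784, hidden=256, num_classes=10, fp_dim=32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(in_features, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.primary_head = nn.Linear(hidden, num_classes)  # task prediction
        self.uncertainty_head = nn.Linear(hidden, fp_dim)   # fingerprint head

    def forward(self, x):
        h = self.backbone(x)
        # Single shot: both outputs come from one shared forward pass.
        return self.primary_head(h), torch.sigmoid(self.uncertainty_head(h))

def fingerprint_matches(fp, reference_fp, tol=0.05):
    """Flag a fault when the online fingerprint drifts from the fault-free
    reference by more than `tol` (assumed mean-L1 matching criterion)."""
    return (fp - reference_fp).abs().mean(dim=-1) <= tol

# Usage: one forward pass yields the prediction and the self-test signal.
model = DualHeadNet()
x = torch.randn(4, 784)
logits, fp = model(x)
reference_fp = torch.full((32,), 0.5)  # assumed golden (fault-free) fingerprint
print(logits.argmax(dim=-1), fingerprint_matches(fp, reference_fp))
```

In this sketch the self-test costs only one extra linear layer per inference, which is consistent with the abstract's claim of concurrent testing without separate test vectors; the specific fingerprint representation and matching rule used by the paper may differ.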