Exploration of Activation Fault Reliability in Quantized Systolic Array-Based DNN Accelerators (2401.09509v1)

Published 17 Jan 2024 in cs.AR and cs.LG

Abstract: Stringent reliability requirements for Deep Neural Network (DNN) accelerators coexist with the need to reduce the computational burden on hardware platforms, i.e., to lower energy consumption and execution time while increasing accelerator efficiency. Moreover, the growing demand for specialized DNN accelerators with tailored requirements, particularly for safety-critical applications, necessitates a comprehensive design space exploration that yields efficient and robust accelerators meeting those requirements. The trade-off between hardware performance, i.e., area and delay, and the reliability of the DNN accelerator implementation therefore becomes critical and calls for analysis tools. This paper presents a comprehensive methodology that enables a holistic assessment of the trilateral impact of quantization on model accuracy, activation fault reliability, and hardware efficiency. A fully automated framework is introduced that applies various quantization-aware techniques, performs fault injection, and produces hardware implementations, thus enabling the measurement of hardware parameters. In addition, the paper proposes a novel lightweight protection technique, integrated within the framework, to ensure the dependable deployment of the final systolic-array-based FPGA implementation. Experiments on established benchmarks demonstrate the analysis flow and the profound implications of quantization for reliability, hardware performance, and network accuracy, particularly concerning transient faults in the network's activations.
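
The paper's framework is not reproduced here, but a minimal sketch can illustrate the kind of activation fault-injection experiment the abstract describes: quantize an activation tensor, flip random bits in its two's-complement words to model transient faults, and apply a lightweight range-based guard. All names (`quantize`, `inject_bit_flips`, `clip_protect`), the symmetric uniform quantizer, and the uniform bit-error-rate fault model are illustrative assumptions, not the paper's actual implementation; the range-clipping guard stands in for "lightweight protection" in the spirit of clipped-activation schemes and is not the paper's specific technique.

```python
import numpy as np

def quantize(x, n_bits=8):
    """Symmetric uniform quantization to signed n-bit integers (illustrative)."""
    scale = max(np.max(np.abs(x)) / (2 ** (n_bits - 1) - 1), 1e-12)
    q = np.clip(np.round(x / scale), -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1)
    return q.astype(np.int32), scale

def inject_bit_flips(q_act, n_bits=8, ber=1e-3, rng=None):
    """Flip random bits in the two's-complement words of a quantized activation
    tensor, modelling transient (soft) errors in on-chip activation memory.
    A uniform bit-error rate (ber) across all words is an assumed fault model."""
    rng = np.random.default_rng(0) if rng is None else rng
    flat = q_act.ravel().copy()
    n_faults = int(ber * flat.size * n_bits)
    words = rng.integers(0, flat.size, size=n_faults)   # which activation words
    bits = rng.integers(0, n_bits, size=n_faults)       # which bit positions
    mask = (1 << n_bits) - 1
    for w, b in zip(words, bits):
        v = (int(flat[w]) & mask) ^ (1 << b)            # flip one bit (unsigned view)
        flat[w] = v - (1 << n_bits) if v >= 1 << (n_bits - 1) else v  # back to signed
    return flat.reshape(q_act.shape)

def clip_protect(q_act, bound):
    """Illustrative lightweight guard: saturate activations to a profiled
    fault-free range (clipped-activation style; NOT the paper's technique)."""
    return np.clip(q_act, -bound, bound)

# Toy end-to-end run on a dummy activation tensor.
acts = np.random.default_rng(1).standard_normal((1, 64, 8, 8)).astype(np.float32)
q, scale = quantize(acts, n_bits=8)
q_faulty = inject_bit_flips(q, n_bits=8, ber=1e-3)
q_guarded = clip_protect(q_faulty, bound=int(np.max(np.abs(q))))
print("fault MSE  :", np.mean((q_faulty * scale - acts) ** 2))
print("guarded MSE:", np.mean((q_guarded * scale - acts) ** 2))
```

In a flow like the paper's, the faulty tensors would be propagated through the rest of the network so that accuracy degradation can be compared across bit widths; here the MSE against the fault-free activations merely stands in for that downstream evaluation.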
