
MPGemmFI: A Fault Injection Technique for Mixed Precision GEMM in ML Applications (2311.05782v1)

Published 9 Nov 2023 in cs.DC

Abstract: Emerging deep learning workloads urgently need fast general matrix multiplication (GEMM). To meet this demand, a critical feature of machine-learning-specific accelerators such as NVIDIA Tensor Cores, AMD Matrix Cores, and Google TPUs is support for mixed-precision GEMM. For DNN models, lower-precision FP data formats and computation offer acceptable accuracy with significant improvements in performance, area, and memory footprint. While promising, the error resilience of mixed-precision computation remains unexplored. To this end, we develop a fault injection framework that systematically injects faults into mixed-precision computation results. We investigate how these faults affect the accuracy of machine learning applications. Based on the observed error resilience characteristics, we offer lightweight error detection and correction solutions that significantly improve overall model accuracy when models experience hardware faults. The solutions can be efficiently integrated into the accelerator's pipelines.
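The abstract describes injecting faults into the outputs of a mixed-precision GEMM. A minimal sketch of this idea, assuming a single-bit transient fault model and a Tensor-Core-style FP16-in / FP32-accumulate / FP16-out GEMM (the function name and shapes here are illustrative, not the paper's actual framework):

```python
import numpy as np

rng = np.random.default_rng(0)

def inject_bit_flip(c_fp16, row, col, bit):
    """Flip one bit of a single FP16 GEMM output element (transient fault model)."""
    faulty = c_fp16.copy()
    bits = faulty.view(np.uint16)          # reinterpret FP16 values as raw 16-bit words
    bits[row, col] ^= np.uint16(1 << bit)  # XOR flips exactly the chosen bit
    return faulty

# Mixed-precision GEMM: FP16 inputs, FP32 accumulation, FP16 result.
a = rng.standard_normal((8, 8)).astype(np.float16)
b = rng.standard_normal((8, 8)).astype(np.float16)
c = (a.astype(np.float32) @ b.astype(np.float32)).astype(np.float16)

# Inject a single-bit fault into one result element; high exponent bits
# typically cause the largest numerical deviation.
c_faulty = inject_bit_flip(c, row=3, col=5, bit=14)
```

Comparing `c_faulty` against the golden output `c` element-by-element is the basic measurement a fault-injection campaign repeats across many sites and bit positions to characterize error resilience.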

Authors (12)
  1. Bo Fang (26 papers)
  2. Xinyi Li (97 papers)
  3. Harvey Dam (5 papers)
  4. Cheng Tan (140 papers)
  5. Siva Kumar Sastry Hari (10 papers)
  6. Timothy Tsai (9 papers)
  7. Ignacio Laguna (12 papers)
  8. Dingwen Tao (60 papers)
  9. Ganesh Gopalakrishnan (27 papers)
  10. Prashant Nair (5 papers)
  11. Kevin Barker (16 papers)
  12. Ang Li (472 papers)