
Special Session: Reliability Analysis for ML/AI Hardware (2103.12166v2)

Published 22 Mar 2021 in cs.AR

Abstract: AI and Machine Learning (ML) are becoming pervasive in today's applications, such as autonomous vehicles, healthcare, aerospace, cybersecurity, and many other critical applications. Ensuring the reliability and robustness of the underlying AI/ML hardware is therefore of paramount importance. In this paper, we explore and evaluate the reliability of different AI/ML hardware. The first section outlines the reliability issues in a commercial systolic array-based ML accelerator in the presence of faults arising from device-level non-idealities in the DRAM. Next, we quantify the impact of circuit-level faults in the MSB and LSB logic cones of the Multiply and Accumulate (MAC) block of the AI accelerator on AI/ML accuracy. Finally, we present two key reliability issues in emerging neuromorphic hardware platforms, circuit aging and endurance, and our system-level approach to mitigating them.

Authors (7)
  1. Shamik Kundu (9 papers)
  2. Kanad Basu (23 papers)
  3. Mehdi Sadi (9 papers)
  4. Twisha Titirsha (8 papers)
  5. Shihao Song (22 papers)
  6. Anup Das (48 papers)
  7. Ujjwal Guin (14 papers)
Citations (16)