SiTe CiM: Signed Ternary Computing-in-Memory for Ultra-Low Precision Deep Neural Networks (2408.13617v1)

Published 24 Aug 2024 in cs.AR

Abstract: Ternary Deep Neural Networks (DNNs) have shown great potential for highly energy-constrained systems by virtue of their low power operation (due to ultra-low precision) with only a mild degradation in accuracy. To enable an energy-efficient hardware substrate for such systems, we propose a compute-enabled memory design, referred to as SiTe CiM, which features computing-in-memory (CiM) of dot products between signed ternary (SiTe) inputs and weights. SiTe CiM is based on cross-coupling of two bit cells to enable CiM of dot products in the signed ternary regime. We explore SiTe CiM with 8T-SRAM, 3T-embedded DRAM (3T-eDRAM) and 3T-ferroelectric metal FET (FEMFET) memories. We propose two flavors of this technique, namely SiTe CiM I and SiTe CiM II. In SiTe CiM I, we employ two additional transistors per cell for cross-coupling, achieving fast CiM operations, albeit incurring an area overhead ranging from 18% to 34% (compared to standard ternary memories). In SiTe CiM II, four extra transistors are utilized for every 16 cells in a column, thereby incurring only 6% area cost (but leading to slower CiM than SiTe CiM I). Based on the array analysis, our designs achieve up to 88% lower CiM latency and 78% CiM energy savings across the various technologies considered, as compared to their respective near-memory computing counterparts. Further, we perform system-level analysis by incorporating SiTe CiM I/II arrays in a ternary DNN accelerator and show up to 7X throughput boost and up to 2.5X energy reduction compared to near-memory ternary DNN accelerators.
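
As a purely arithmetic illustration of the operation the abstract describes, the sketch below computes a dot product between signed ternary inputs and weights, where every value is -1, 0, or +1. It is not the paper's circuit and says nothing about how the SiTe CiM arrays evaluate the result. The function name `ternary_dot` and the split into counts of +1 and -1 products are assumptions made here for illustration only.

```python
def ternary_dot(inputs, weights):
    """Dot product of two signed ternary vectors (values in {-1, 0, +1}).

    Illustrative software model only; the name and the counting
    formulation are assumptions, not taken from the paper.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    for value in (*inputs, *weights):
        if value not in (-1, 0, 1):
            raise ValueError(f"non-ternary value: {value!r}")

    # Each elementwise product is itself ternary, so the dot product
    # equals (number of +1 products) minus (number of -1 products).
    products = [x * w for x, w in zip(inputs, weights)]
    positives = sum(1 for p in products if p == 1)
    negatives = sum(1 for p in products if p == -1)
    return positives - negatives


if __name__ == "__main__":
    x = [1, -1, 0, 1]
    w = [1, 1, -1, -1]
    print(ternary_dot(x, w))
```

Because every product is again ternary, the result reduces to a difference of two counts, which is one way to see why such ultra-low-precision arithmetic is cheap relative to multi-bit multiply-accumulate.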
