
X-SRAM: Enabling In-Memory Boolean Computations in CMOS Static Random Access Memories (1712.05096v2)

Published 14 Dec 2017 in cs.ET

Abstract: Silicon-based Static Random Access Memories (SRAM) and digital Boolean logic have been the workhorse of the state-of-art computing platforms. Despite tremendous strides in scaling the ubiquitous metal-oxide-semiconductor transistor, the underlying von-Neumann computing architecture has remained unchanged. The limited throughput and energy-efficiency of the state-of-art computing systems, to a large extent, results from the well-known von-Neumann bottleneck. The energy and throughput inefficiency of the von-Neumann machines have been accentuated in recent times due to the present emphasis on data-intensive applications like artificial intelligence, machine learning etc. A possible approach towards mitigating the overhead associated with the von-Neumann bottleneck is to enable in-memory Boolean computations. In this manuscript, we present an augmented version of the conventional SRAM bit-cells, called the X-SRAM, with the ability to perform in-memory, vector Boolean computations, in addition to the usual memory storage operations. We propose at least six different schemes for enabling in-memory vector computations including NAND, NOR, IMP (implication), XOR logic gates with respect to different bit-cell topologies: the 8T cell and the 8+T Differential cell. In addition, we also present a novel 'read-compute-store' scheme, wherein the computed Boolean function can be directly stored in the memory without the need of latching the data and carrying out a subsequent write operation. The feasibility of the proposed schemes has been verified using predictive transistor models and Monte-Carlo variation analysis.

Authors (4)
  1. Amogh Agrawal (10 papers)
  2. Akhilesh Jaiswal (29 papers)
  3. Chankyu Lee (12 papers)
  4. Kaushik Roy (265 papers)
Citations (193)

Summary

Enabling In-Memory Boolean Computations in CMOS Static RAM: An Analysis

The paper "X-SRAM: Enabling In-Memory Boolean Computations in CMOS Static Random Access Memories" presents a novel approach to alleviate the von-Neumann bottleneck faced by traditional computing architectures. It introduces the concept of X-SRAM, an augmented form of the conventional SRAM, capable of performing in-memory Boolean computations. This development is driven by the increasing demand for energy efficiency and throughput in data-intensive applications such as AI and cryptography.

Overview of X-SRAM

SRAM, a staple in modern computing, traditionally serves as a storage medium, relying on separate computational units to process data. The paper highlights the inefficiencies inherent in this architecture, primarily the energy and throughput costs associated with data movement between memory and processing units. X-SRAM addresses these inefficiencies by integrating vector Boolean computations directly within the SRAM bit-cells, leveraging standard CMOS technology.

The authors propose several schemes for in-memory computation using two variants of CMOS SRAM cells: the 8-transistor (8T) cell and the 8+T differential cell. Both variants are explored for their capability to perform basic logic operations such as NAND, NOR, IMP, and XOR gates.
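As a rough illustration of what "vector Boolean computation" means here, the sketch below models two stored memory rows as Python integers and evaluates the four functions named above column by column across the whole row. It is a purely functional model, not a circuit description. The function name `row_ops`, the 8-bit row width, and the operand order of IMP (taken as "a implies b", i.e. NOT a OR b) are assumptions made for this example.

```python
WIDTH = 8                    # assumed row width in bits (hypothetical)
MASK = (1 << WIDTH) - 1      # keeps every result within the row width

def row_ops(a: int, b: int) -> dict:
    """Bitwise functions of two stored rows a and b, one result bit per column."""
    return {
        "NAND": ~(a & b) & MASK,
        "NOR": ~(a | b) & MASK,
        "IMP": (~a | b) & MASK,   # assumed ordering: a implies b
        "XOR": (a ^ b) & MASK,
    }

row_a = 0b11001010
row_b = 0b10100110
for name, value in row_ops(row_a, row_b).items():
    print(f"{name:4s} {value:0{WIDTH}b}")
```

Each result bit depends only on the two bits stored in the same column, which is why the operation can in principle be carried out on all columns of a row pair at once.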

Technical Details

  1. 8T SRAM Bit-Cells:
    • NOR and NAND Operations: The paper demonstrates the use of skewed inverters for NOR and NAND operations within the 8T cell. The circuit exploits the cell's isolated read port, enabling simultaneous activation of multiple read word lines without read-disturb concerns.
    • Voltage Divider Scheme: For XOR and IMP operations, the authors employ a voltage divider method by activating specific transistors within the cell structure.
  2. 8+T Differential SRAM:
    • Leveraging differential sensing similar to traditional 6T cells, the paper introduces asymmetric sense amplifiers to achieve NAND/NOR operations more robustly. This variant supports the proposed read-compute-store (RCS) scheme due to its decoupled read-write paths.
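One plausible reading of the skewed-inverter sensing described for the 8T cell can be sketched behaviorally as follows. The model assumes that a stored 1 pulls the shared read bit-line (RBL) down when its read word line is active, so that with two rows activated the RBL level depends on how many of the two cells store a 1. Each skewed inverter is reduced to a simple threshold comparison. The normalized voltage levels, the trip points, and the function name `sense` are hypothetical values chosen for illustration and are not taken from the paper.

```python
# Assumed, normalized RBL levels at sensing time, indexed by how many of the
# two activated cells pull the line down. Illustrative numbers only.
RBL_LEVEL = {0: 1.0, 1: 0.55, 2: 0.15}

TRIP_HIGH = 0.8    # hypothetical trip point of the inverter skewed toward VDD
TRIP_LOW = 0.35    # hypothetical trip point of the inverter skewed toward ground

def sense(bit_a: int, bit_b: int) -> tuple[int, int]:
    """Return (NOR, NAND) for one column with both read word lines active."""
    rbl = RBL_LEVEL[bit_a + bit_b]
    nor_out = int(rbl > TRIP_HIGH)    # line stays high only if both bits are 0
    nand_out = int(rbl > TRIP_LOW)    # line drops below TRIP_LOW only if both are 1
    return nor_out, nand_out

for a in (0, 1):
    for b in (0, 1):
        print(a, b, sense(a, b))
```

In this simplified picture, two differently skewed detectors on the same bit-line separate the three possible discharge levels, giving NOR and NAND from a single read.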

The feasibility of these in-memory schemes is supported by simulations using predictive transistor models and by Monte-Carlo variation analysis.

Implications and Future Directions

The introduction of X-SRAM has considerable implications for future computing systems. By reducing the need for data transfer between memory and processors, it lowers the energy cost and eases the throughput limits imposed by the von-Neumann bottleneck. The practical impact is illustrated by an implementation of AES encryption on an X-SRAM-equipped architecture, which shows up to a 75% reduction in memory accesses.
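To make the idea of saved memory accesses concrete, the sketch below counts accesses for a single XOR of two stored words, the operation at the heart of the AES AddRoundKey step. It compares a conventional read-read-write sequence against a hypothetical in-memory XOR that stores its result directly, in the spirit of the read-compute-store scheme. The class `ToyMemory`, the row addresses, the example data words, and the rule that one in-memory operation counts as one access are all assumptions of this toy model. The counts it produces follow from those assumptions and do not reproduce the 75% figure.

```python
class ToyMemory:
    """Row-addressable memory that counts accesses (illustrative model only)."""

    def __init__(self, rows: int):
        self.data = [0] * rows
        self.accesses = 0

    def read(self, addr: int) -> int:
        self.accesses += 1
        return self.data[addr]

    def write(self, addr: int, value: int) -> None:
        self.accesses += 1
        self.data[addr] = value

    def xor_in_memory(self, addr_a: int, addr_b: int, dest: int) -> None:
        # Assumption: activating two rows, computing XOR, and storing the
        # result is counted as a single access.
        self.accesses += 1
        self.data[dest] = self.data[addr_a] ^ self.data[addr_b]

STATE, KEY, OUT = 0, 1, 2   # hypothetical row addresses

def add_round_key_conventional(mem: ToyMemory) -> None:
    # Two reads into the processor, XOR there, one write back.
    mem.write(OUT, mem.read(STATE) ^ mem.read(KEY))

def add_round_key_in_memory(mem: ToyMemory) -> None:
    mem.xor_in_memory(STATE, KEY, OUT)

def run(step) -> tuple[int, int]:
    """Run one variant on fresh memory; return (result word, access count)."""
    mem = ToyMemory(3)
    mem.data[STATE], mem.data[KEY] = 0x3243F6A8, 0x2B7E1516  # arbitrary example words
    step(mem)
    return mem.data[OUT], mem.accesses

print(run(add_round_key_conventional))
print(run(add_round_key_in_memory))
```

Both variants produce the same result word; only the number of counted accesses differs.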

This work expands the potential of traditional architectures, paving the way for more efficient, high-throughput systems. Future research could explore the integration of X-SRAM into varied computing paradigms beyond cryptographic applications, potentially impacting neural networks and large-scale data processing tasks.

Conclusion

This paper successfully addresses the persistent challenge of energy inefficiency within computing systems, introducing a viable solution via X-SRAM. Its strategic integration of logic within memory not only enhances operational efficiency but also opens new avenues for computational innovation. As the demand for high-performance systems grows, so too does the significance of advancements like X-SRAM, heralding a shift towards more intelligent and efficient computing architectures.