MIMDRAM: An End-to-End Processing-Using-DRAM System for High-Throughput, Energy-Efficient and Programmer-Transparent Multiple-Instruction Multiple-Data Processing (2402.19080v2)

Published 29 Feb 2024 in cs.AR and cs.DC

Abstract: Processing-using-DRAM (PUD) is a processing-in-memory (PIM) approach that uses a DRAM array's massive internal parallelism to execute very-wide data-parallel operations, in a single-instruction multiple-data (SIMD) fashion. However, DRAM rows' large and rigid granularity limits the effectiveness and applicability of PUD in three ways. First, since applications have varying degrees of SIMD parallelism, PUD execution often leads to underutilization, throughput loss, and energy waste. Second, most PUD architectures are limited to the execution of parallel map operations. Third, the need to feed the wide DRAM row with tens of thousands of data elements, combined with the lack of adequate compiler support for PUD systems, creates a programmability barrier. Our goal is to design a flexible PUD system that overcomes the limitations caused by the large and rigid granularity of PUD. To this end, we propose MIMDRAM, a hardware/software co-designed PUD system that introduces new mechanisms to allocate and control only the necessary resources for a given PUD operation. The key idea of MIMDRAM is to leverage fine-grained DRAM (i.e., the ability to independently access smaller segments of a large DRAM row) for PUD computation. MIMDRAM exploits this key idea to enable a multiple-instruction multiple-data (MIMD) execution model in each DRAM subarray. We evaluate MIMDRAM using twelve real-world applications and 495 multi-programmed application mixes. Our evaluation shows that MIMDRAM provides 34x the performance, 14.3x the energy efficiency, 1.7x the throughput, and 1.3x the fairness of a state-of-the-art PUD framework, along with 30.6x and 6.8x the energy efficiency of a high-end CPU and GPU, respectively. MIMDRAM adds a small area cost to a DRAM chip (1.11%) and CPU die (0.6%).


Summary

  • The paper introduces MIMDRAM, which improves energy efficiency by up to 14.3x and performance by up to 34x over a state-of-the-art PUD framework by enabling flexible, fine-grained DRAM activation.
  • It adds low-cost intra- and inter-mat connectivity to reduce data-movement overhead and remove the need for CPU intervention during vector reductions.
  • A co-designed compiler auto-vectorizes loops and schedules instructions to maximize concurrent DRAM operations.

Overview of MIMDRAM: A Processing-Using-DRAM System for High Throughput

The paper introduces MIMDRAM, a processing-using-DRAM (PUD) architecture designed to improve the efficiency and applicability of memory-intensive computing through a flexible, fine-grained DRAM execution model. As data movement becomes a dominant source of energy consumption and latency in traditional computing architectures, MIMDRAM moves computation into the memory itself, exploiting the inherent parallelism of DRAM arrays.

Key Contributions and Methodology

MIMDRAM addresses the inefficiencies of existing PUD systems such as SIMDRAM, which are constrained by the fixed granularity of DRAM operations and therefore often leave resources underutilized. The paper presents a hardware/software co-designed system with the following improvements:

  1. Fine-Grained DRAM Activation: By modifying the DRAM access circuitry, MIMDRAM allows independent operation of DRAM mats within a subarray. This flexibility permits multiple concurrent PUD operations sized to each application's degree of data parallelism, improving SIMD utilization (a minimal allocation sketch follows this list).
  2. Intra- and Inter-Mat Connectivity: Low-cost interconnects within and across DRAM mats enable efficient data movement, which is crucial for operations such as vector reductions, a task that traditionally requires CPU intervention.
  3. Compilation and Scheduling Support: MIMDRAM incorporates compiler support to auto-vectorize loops and distribute computations across the available DRAM mats, optimizing execution and minimizing energy consumption.
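
To make the mat-allocation idea concrete, below is a minimal sketch of how operations with different SIMD widths could be packed onto the mats of one subarray so that they issue concurrently. All names and parameters here (MAT_WIDTH, MATS_PER_SUBARRAY, PudOp, allocate) are illustrative assumptions for exposition, not MIMDRAM's actual interface or configuration.

```python
from dataclasses import dataclass

MAT_WIDTH = 512          # assumed bit-serial SIMD lanes per DRAM mat (illustrative)
MATS_PER_SUBARRAY = 16   # assumed number of mats per subarray (illustrative)

@dataclass
class PudOp:
    name: str
    num_elems: int  # data parallelism of the vectorized loop

    @property
    def mats_needed(self) -> int:
        # Ceiling division: each mat provides MAT_WIDTH lanes.
        return -(-self.num_elems // MAT_WIDTH)

def allocate(ops, mats_per_subarray=MATS_PER_SUBARRAY):
    """First-fit-decreasing packing of PUD operations onto one subarray's mats.

    Operations placed in the same slot occupy disjoint mats and can issue
    concurrently, each with its own instruction stream: the MIMD execution
    model inside a single subarray. (Operations wider than a subarray would
    have to be split across subarrays; this sketch ignores that case.)
    """
    slots = []  # each slot: {"ops": [...], "free": remaining mats}
    for op in sorted(ops, key=lambda o: o.mats_needed, reverse=True):
        for slot in slots:
            if op.mats_needed <= slot["free"]:
                slot["ops"].append(op)
                slot["free"] -= op.mats_needed
                break
        else:
            slots.append({"ops": [op], "free": mats_per_subarray - op.mats_needed})
    return slots

ops = [PudOp("vec_add", 2048), PudOp("relu", 512), PudOp("scale", 1024)]
for i, slot in enumerate(allocate(ops)):
    used = ", ".join(f"{o.name}:{o.mats_needed} mats" for o in slot["ops"])
    print(f"issue slot {i}: {used} ({slot['free']} mats idle)")
```

Under a rigid subarray-wide SIMD model these three operations would serialize, each leaving most of the row idle; packing them onto disjoint mats is what recovers the lost utilization.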

Evaluation and Results

The evaluation compares MIMDRAM against CPUs, GPUs, and the SIMDRAM architecture across twelve real-world applications and 495 multi-programmed application mixes, showing significant gains in performance and energy efficiency (the throughput and fairness metrics used for the mixes are sketched after the list below):

  • MIMDRAM achieves 14.3x the energy efficiency and 34x the performance of SIMDRAM, largely because it adapts resource allocation to each application's degree of parallelism and reduces data movement. It also provides 1.7x the throughput and 1.3x the fairness of SIMDRAM on multi-programmed mixes.
  • It delivers 30.6x and 6.8x the energy efficiency of a high-end CPU and GPU, respectively, by computing in situ and exploiting a multiple-instruction multiple-data (MIMD) execution model within each DRAM subarray.
  • A scalability study indicates that extending this paradigm across multiple subarrays and DRAM banks can further increase performance.
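
For the multi-programmed results, the paper reports system throughput and fairness. The sketch below computes the conventional definitions of these metrics (weighted speedup, and the ratio of smallest to largest slowdown) from per-application IPCs; these definitions are assumed from the standard multi-program metrics literature rather than stated by this summary, and the numbers are made up for illustration.

```python
def weighted_speedup(ipc_alone, ipc_shared):
    """System throughput: sum of each application's IPC in the mix
    relative to its IPC when running alone."""
    return sum(s / a for a, s in zip(ipc_alone, ipc_shared))

def fairness(ipc_alone, ipc_shared):
    """Ratio of the smallest to the largest per-application slowdown;
    1.0 means every application in the mix slows down equally."""
    slowdowns = [a / s for a, s in zip(ipc_alone, ipc_shared)]
    return min(slowdowns) / max(slowdowns)

# Illustrative two-application mix (made-up IPCs, not results from the paper).
alone, shared = [2.0, 1.5], [1.6, 0.9]
print(f"weighted speedup = {weighted_speedup(alone, shared):.2f}")  # 1.40
print(f"fairness         = {fairness(alone, shared):.2f}")          # 0.75
```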

Implications and Future Directions

MIMDRAM illustrates a pathway toward more energy-efficient, high-performance computing architectures through a closer integration of computation and memory. Its design encourages further exploration of hybrid architectures that combine in-memory and near-memory computation for broader classes of applications, and it could inform the design principles of future data-centric systems in domains ranging from AI to complex data analytics.

In summary, MIMDRAM represents a forward-thinking approach to addressing current bottlenecks in computational efficiency, paving the way for more agile and scalable processing infrastructures for increasingly data-intensive workloads.
