Processing Data Where It Makes Sense: Enabling In-Memory Computation (1903.03988v1)

Published 10 Mar 2019 in cs.AR

Abstract: Today's systems are overwhelmingly designed to move data to computation. This design choice goes directly against at least three key trends in systems that cause performance, scalability and energy bottlenecks: (1) data access from memory is already a key bottleneck as applications become more data-intensive and memory bandwidth and energy do not scale well, (2) energy consumption is a key constraint in especially mobile and server systems, (3) data movement is very expensive in terms of bandwidth, energy and latency, much more so than computation. At the same time, conventional memory technology is facing many scaling challenges in terms of reliability, energy, and performance. As a result, memory system architects are open to organizing memory in different ways and making it more intelligent, at the expense of higher cost. The emergence of 3D-stacked memory plus logic as well as the adoption of error correcting codes inside DRAM chips, and the necessity for designing new solutions to serious reliability and security issues, such as the RowHammer phenomenon, are an evidence of this trend. Recent research aims to practically enable computation close to data. We discuss at least two promising directions for processing-in-memory (PIM): (1) performing massively-parallel bulk operations in memory by exploiting the analog operational properties of DRAM, with low-cost changes, (2) exploiting the logic layer in 3D-stacked memory technology to accelerate important data-intensive applications. In both approaches, we describe and tackle relevant cross-layer research, design, and adoption challenges in devices, architecture, systems, and programming models. Our focus is on the development of in-memory processing designs that can be adopted in real computing platforms at low cost.

Authors (4)
  1. Onur Mutlu (279 papers)
  2. Saugata Ghose (59 papers)
  3. Juan Gómez-Luna (57 papers)
  4. Rachata Ausavarungnirun (27 papers)
Citations (207)

Summary

Enabling In-Memory Computation: A Paradigm Shift for Data-Intensive Systems

The paper "Processing Data Where It Makes Sense: Enabling In-Memory Computation" by Onur Mutlu and collaborators examines the inefficiencies of modern computing systems and presents a compelling argument for a fundamental shift toward processing-in-memory (PIM) architectures. To counteract the performance and energy inefficiencies of traditional processor-centric designs, the paper argues for intelligent memory systems that place computation closer to where the data resides.

Key Motivations and Challenges

The fundamental premise of the paper is that data movement across the memory hierarchy in contemporary systems is intrinsically inefficient. These inefficiencies arise from several key trends: memory bandwidth and energy do not scale with application demands, energy consumption is an increasingly binding constraint, and moving data costs far more than computing on it. Specifically, the paper identifies challenges in DRAM scaling related to reliability, energy, and performance, compounded by phenomena such as RowHammer, which pose reliability and security risks.

These factors suggest an urgent need to evolve beyond conventional designs, prompting the paper to explore practical implementations for in-memory computation. The authors highlight two major avenues for PIM: exploiting the inherent analog operations of DRAM and utilizing the logic layer in 3D-stacked memory to handle large-scale data operations efficiently.

Processing-In-Memory Approaches

The authors propose two primary approaches to enabling in-memory computation:

  1. Minimal Changes to DRAM Chips: This involves leveraging the existing DRAM architecture to perform bulk operations with minimal modifications. Mechanisms like RowClone for rapid data copy and initialization, and Ambit for bulk bitwise operations showcase how DRAM can inherently handle specific tasks proficiently within the chip, thereby reducing latency and energy consumption significantly.
  2. Utilizing 3D-Stacked Memory: This approach takes advantage of the stacked architecture to embed processing logic directly into memory. Techniques and architectures like Tesseract for graph processing and consumer workload optimizations provide high internal bandwidth and low latency processing capabilities, demonstrating substantial improvements in performance and energy efficiency across a range of applications, from databases to machine learning.
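To make the first approach concrete, the following is a minimal host-side sketch of the bitwise semantics that Ambit computes inside DRAM. Ambit activates three DRAM rows simultaneously, and charge sharing on the bitlines yields the bitwise majority of the three rows; AND and OR then fall out by pre-initializing the third row to all zeros or all ones. The helper names below are illustrative, not from the paper, and Python integers stand in for DRAM rows.

```python
# Sketch of Ambit-style bulk bitwise operations, emulated on the host.
# A DRAM row is modeled as a WIDTH-bit integer; triple-row activation
# is modeled as a bitwise 3-input majority.

WIDTH = 64                 # bits per "row" in this sketch
MASK = (1 << WIDTH) - 1    # an all-ones row

def majority3(a: int, b: int, c: int) -> int:
    """Bitwise majority of three rows (what triple-row activation yields)."""
    return (a & b) | (b & c) | (a & c)

def bulk_and(a: int, b: int) -> int:
    # AND(A, B) = MAJ(A, B, 0): third row pre-initialized to all zeros.
    return majority3(a, b, 0)

def bulk_or(a: int, b: int) -> int:
    # OR(A, B) = MAJ(A, B, 1): third row pre-initialized to all ones.
    return majority3(a, b, MASK)

a, b = 0b1100, 0b1010
print(bin(bulk_and(a, b)))  # 0b1000
print(bin(bulk_or(a, b)))   # 0b1110
```

The point of the sketch is only the semantics: in actual Ambit hardware the majority is computed in the analog domain across an entire row at once, which is what makes the operation "bulk" and avoids moving the operands to the CPU.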

Implications and Future Directions

The implications of adopting PIM architectures are profound, suggesting notable reductions in energy consumption and execution time for data-intensive applications by trimming down unnecessary data movement. The integration of PIM presents a transformative opportunity for designing holistic, data-centric systems that overcome existing bottlenecks.

The paper paves the way for further exploration of the challenges that must be addressed for widespread PIM adoption, including developing robust programming models, devising efficient memory coherence protocols, addressing virtual memory challenges, and optimizing data and code mapping for improved efficiency. Furthermore, a standard benchmark suite for evaluating PIM implementations would accelerate research and encourage standardization.

Concluding Thoughts

In advocating for a shift to in-memory computation, the authors underscore an essential transition in system architecture design aimed at improving the scalability and efficiency of data-intensive computing systems. By moving towards a data-centric paradigm, future systems could harness notable gains in performance and energy efficiency, fostering the development of advanced applications and novel computing architectures. Researchers and practitioners in the field are thus urged to explore and address the remaining challenges, bolstering the potential for PIM architectures to redefine computing's landscape.