
RowClone: Accelerating Data Movement and Initialization Using DRAM (1805.03502v1)

Published 7 May 2018 in cs.AR

Abstract: In existing systems, to perform any bulk data movement operation (copy or initialization), the data has to first be read into the on-chip processor, all the way into the L1 cache, and the result of the operation must be written back to main memory. This is despite the fact that these operations do not involve any actual computation. RowClone exploits the organization and operation of commodity DRAM to perform these operations completely inside DRAM using two mechanisms. The first mechanism, Fast Parallel Mode, copies data between two rows inside the same DRAM subarray by issuing back-to-back activate commands to the source and the destination row. The second mechanism, Pipelined Serial Mode, transfers cache lines between two banks using the shared internal bus. RowClone significantly reduces the raw latency and energy consumption of bulk data copy and initialization. This reduction directly translates to improvement in performance and energy efficiency of systems running copy- or initialization-intensive workloads.
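The two mechanisms named in the abstract can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the paper's hardware design: the `Bank` class, the `fpm_copy` and `psm_copy` helpers, and the row and cache-line sizes are all hypothetical names and values chosen for illustration. Each `Bank` object stands in for a single subarray with one row buffer, and the model assumes that a second activate issued while the row buffer still holds data overwrites the newly opened row with the buffer contents.

```python
CACHE_LINE = 64   # bytes per cache line (assumed for illustration)
ROW_BYTES = 256   # toy row size, far smaller than a real DRAM row


class Bank:
    """Toy model: one subarray with a single row buffer."""

    def __init__(self, num_rows):
        self.rows = [bytearray(ROW_BYTES) for _ in range(num_rows)]
        self.row_buffer = None  # None models the precharged state

    def activate(self, row):
        if self.row_buffer is None:
            # First activate: the row's contents are latched into the buffer.
            self.row_buffer = bytearray(self.rows[row])
        else:
            # Back-to-back activate (modeling assumption): the loaded
            # buffer drives its contents into the newly opened row.
            self.rows[row][:] = self.row_buffer

    def precharge(self):
        self.row_buffer = None


def fpm_copy(bank, src, dst):
    """Fast Parallel Mode sketch: two back-to-back activates, same subarray."""
    bank.activate(src)
    bank.activate(dst)
    bank.precharge()


def psm_copy(src_bank, src_row, dst_bank, dst_row):
    """Pipelined Serial Mode sketch: move one cache line at a time
    between two banks; returns the number of cache-line transfers."""
    src_bank.activate(src_row)
    dst_bank.activate(dst_row)
    transfers = 0
    for off in range(0, ROW_BYTES, CACHE_LINE):
        line = src_bank.row_buffer[off:off + CACHE_LINE]
        dst_bank.row_buffer[off:off + CACHE_LINE] = line
        transfers += 1
    # Write the destination buffer back to its row before closing it.
    dst_bank.rows[dst_row][:] = dst_bank.row_buffer
    src_bank.precharge()
    dst_bank.precharge()
    return transfers
```

In this model, `fpm_copy` moves a whole row in two commands, while `psm_copy` needs one transfer per cache line (`ROW_BYTES // CACHE_LINE` in total). That mirrors the contrast the abstract draws between a row-granularity copy within a subarray and a cache-line-granularity transfer between banks, without saying anything about real timing or energy.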

Authors (11)
  1. Vivek Seshadri (25 papers)
  2. Yoongu Kim (10 papers)
  3. Chris Fallin (4 papers)
  4. Donghyuk Lee (24 papers)
  5. Rachata Ausavarungnirun (27 papers)
  6. Gennady Pekhimenko (52 papers)
  7. Yixin Luo (15 papers)
  8. Onur Mutlu (279 papers)
  9. Phillip B. Gibbons (28 papers)
  10. Michael A. Kozuch (2 papers)
  11. Todd C. Mowry (10 papers)
Citations (17)
