Dataflow-Aware PIM-Enabled Manycore Architecture for Deep Learning Workloads (2403.19073v1)

Published 28 Mar 2024 in cs.AR, cs.AI, and cs.ET

Abstract: Processing-in-memory (PIM) has emerged as an enabler for the energy-efficient and high-performance acceleration of deep learning (DL) workloads. Resistive random-access memory (ReRAM) is one of the most promising technologies to implement PIM. However, as the complexity of deep convolutional neural networks (DNNs) grows, we need to design a manycore architecture with multiple ReRAM-based processing elements (PEs) on a single chip. Existing PIM-based architectures mostly focus on computation while ignoring the role of communication. ReRAM-based tiled manycore architectures often involve many PEs, which need to be interconnected via an efficient on-chip communication infrastructure. Simply allocating more resources (ReRAMs) to speed up computation alone is ineffective if the communication infrastructure cannot keep up. In this paper, we highlight the design principles of a dataflow-aware PIM-enabled manycore platform tailor-made for various types of DL workloads. We consider the design challenges with both 2.5D interposer- and 3D integration-enabled architectures.
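The abstract's central argument is that adding ReRAM-based PEs yields diminishing returns once the on-chip interconnect becomes the bottleneck. The sketch below is not from the paper; it is a minimal first-order latency model with hypothetical numbers (MAC count, traffic volume, per-PE throughput, NoC bandwidth) chosen only to illustrate the crossover from compute-bound to communication-bound operation.

```python
# Hypothetical first-order latency model (not from the paper): total layer latency
# is approximated as max(compute time, on-chip communication time). Scaling the
# number of ReRAM PEs shrinks compute time, but communication time is bounded by
# a fixed NoC bandwidth, so total latency eventually flattens.

MACS_PER_LAYER = 2.0e9                 # hypothetical MAC operations in one DNN layer
BYTES_MOVED_PER_LAYER = 50e6           # hypothetical on-chip traffic for that layer (bytes)
MACS_PER_SEC_PER_PE = 5.0e10           # hypothetical ReRAM crossbar PE throughput
NOC_BANDWIDTH_BYTES_PER_SEC = 100e9    # hypothetical NoC bisection bandwidth


def layer_latency(num_pes: int) -> float:
    """Return the estimated latency (seconds) when the layer is split across num_pes PEs."""
    compute_time = MACS_PER_LAYER / (num_pes * MACS_PER_SEC_PER_PE)
    comm_time = BYTES_MOVED_PER_LAYER / NOC_BANDWIDTH_BYTES_PER_SEC
    # The slower path dominates: extra PEs only help while the design is compute-bound.
    return max(compute_time, comm_time)


if __name__ == "__main__":
    for pes in (1, 16, 64, 256, 1024):
        print(f"{pes:4d} PEs -> {layer_latency(pes) * 1e6:9.1f} us")
```

With these placeholder numbers, latency stops improving beyond a few hundred PEs because the fixed NoC bandwidth sets a floor, which is the kind of imbalance the paper's dataflow-aware 2.5D/3D interconnect design aims to avoid.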

Authors (5)
  1. Harsh Sharma (21 papers)
  2. Gaurav Narang (1 paper)
  3. Janardhan Rao Doppa (62 papers)
  4. Umit Ogras (8 papers)
  5. Partha Pratim Pande (21 papers)
Citations (1)
