
A Customized NoC Architecture to Enable Highly Localized Computing-On-the-Move DNN Dataflow (2111.11744v2)

Published 23 Nov 2021 in cs.AR

Abstract: The ever-increasing computational complexity of fast-growing Deep Neural Networks (DNNs) has called for new computing paradigms to overcome the memory wall of conventional Von Neumann architectures. The emerging Computing-In-Memory (CIM) architecture is a promising candidate for accelerating neural network computing. However, data movement between CIM arrays may still dominate total power consumption in conventional designs. This paper proposes a flexible CIM processor architecture named Domino and a "Computing-On-the-Move" (COM) dataflow, which enable stream computing and local data access to significantly reduce data-movement energy. In addition, Domino employs customized distributed instruction scheduling within the Network-on-Chip (NoC) to implement inter-memory computing and attain mapping flexibility. Evaluation on prevailing DNN models shows that Domino achieves 1.77-to-2.37$\times$ power efficiency over several state-of-the-art CIM accelerators and improves throughput by 1.28-to-13.16$\times$.
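The abstract's core claim is that movement energy, not compute energy, dominates when operands travel many NoC hops between CIM arrays. A minimal first-order sketch of that argument, with purely hypothetical per-hop and per-MAC energy numbers (the paper's actual figures are not reproduced here):

```python
# Toy energy model (hypothetical numbers) illustrating why a localized,
# "computing-on-the-move" dataflow cuts data-movement energy: each operand
# pays an energy cost per NoC hop it traverses, so keeping accesses local
# (fewer average hops) shrinks the movement term of total energy.

E_HOP_PJ = 2.0   # assumed energy per NoC hop per operand (pJ), illustrative only
E_MAC_PJ = 0.5   # assumed energy per multiply-accumulate (pJ), illustrative only

def total_energy_pj(num_ops: int, avg_hops_per_operand: float) -> float:
    """First-order (MAC + movement) energy for num_ops operand accesses."""
    return num_ops * (E_MAC_PJ + avg_hops_per_operand * E_HOP_PJ)

# Conventional dataflow: operands travel far across the chip (say 4 hops).
conventional = total_energy_pj(1_000_000, 4.0)
# Localized COM-style dataflow: operands mostly stay near their CIM array
# (say 0.5 hops on average).
localized = total_energy_pj(1_000_000, 0.5)

print(f"energy ratio (conventional / localized): {conventional / localized:.2f}")
```

Under these assumed constants the localized dataflow wins by several times; the model is only meant to show that the advantage scales with the gap in average hop count, which is what the COM dataflow targets.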

Authors (5)
  1. Kaining Zhou (2 papers)
  2. Yangshuo He (4 papers)
  3. Rui Xiao (18 papers)
  4. Jiayi Liu (60 papers)
  5. Kejie Huang (24 papers)
Citations (1)