Papers
Topics
Authors
Recent
Search
2000 character limit reached

Sparse Matrix to Matrix Multiplication: A Representation and Architecture for Acceleration (long version)

Published 2 Jun 2019 in cs.AR | (1906.00327v1)

Abstract: Accelerators for sparse matrix multiplication are important components in emerging systems. In this paper, we study the main challenges of accelerating Sparse Matrix Multiplication (SpMM). For the situations that data is not stored in the desired order (row/column order), we propose a compact high performance sparse format, which allows for random access to a dataset with low memory access overhead. We show that using this format results in a 14-49 times speedup for SpMM. Next, we propose a high performance systolic architecture for SpMM, which uses a mesh of comparators to locate the useful (non-zero) computation. This design maximizes data reuse by sharing the input data among a row/column of the mesh. We also show that, with similar memory access assumptions, the proposed architecture results in a 9-30 times speedup in comparison with the state of the art.

Citations (3)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.