Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
153 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
45 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

TensorLib: A Spatial Accelerator Generation Framework for Tensor Algebra (2104.12339v1)

Published 26 Apr 2021 in cs.AR

Abstract: Tensor algebra finds applications in various domains, and these applications, especially when accelerated on spatial hardware accelerators, can deliver high performance and low power. Spatial hardware accelerator exhibits complex design space. Prior approaches based on manual implementation lead to low programming productivity, rendering thorough design space exploration impossible. In this paper, we propose TensorLib, a framework for generating spatial hardware accelerator for tensor algebra applications. TensorLib is motivated by the observation that, different dataflows share common hardware modules, which can be reused across different designs. To build such a framework, TensorLib first uses Space-Time Transformation to explore different dataflows, which can compactly represent the hardware dataflow using a simple transformation matrix. Next, we identify the common structures of different dataflows and build parameterized hardware module templates with Chisel. Our generation framework can select the needed hardware modules for each dataflow, connect the modules using a specified interconnection pattern, and automatically generate the complete hardware accelerator design. TensorLib remarkably improves the productivity for the development and optimization of spatial hardware architecture, providing a rich design space with trade-offs in performance, area, and power. Experiments show that TensorLib can automatically generate hardware designs with different dataflows and achieve 21\% performance improvement on FPGA compared to the state-of-the-arts.

Citations (25)

Summary

We haven't generated a summary for this paper yet.