Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash 79 tok/s
Gemini 2.5 Pro 49 tok/s Pro
GPT-5 Medium 45 tok/s
GPT-5 High 43 tok/s Pro
GPT-4o 103 tok/s
GPT OSS 120B 475 tok/s Pro
Kimi K2 215 tok/s Pro
2000 character limit reached

ISA Mapper: A Compute and Hardware Agnostic Deep Learning Compiler (1810.09958v1)

Published 12 Oct 2018 in cs.DC and cs.LG

Abstract: Domain specific accelerators present new challenges and opportunities for code generation onto novel instruction sets, communication fabrics, and memory architectures. In this paper we introduce an intermediate representation (IR) which enables both deep learning computational kernels and hardware capabilities to be described in the same IR. We then formulate and apply instruction mapping to determine the possible ways a computation can be performed on a hardware system. Next, our scheduler chooses a specific mapping and determines the data movement and computation order. In order to manage the large search space of mappings and schedules, we developed a flexible framework that allows heuristics, cost models, and potentially machine learning to facilitate this search problem. With this system, we demonstrate the automated extraction of matrix multiplication kernels out of recent deep learning kernels such as depthwise-separable convolution. In addition, we demonstrate two to five times better performance on DeepBench sized GEMMs and GRU RNN execution when compared to state-of-the-art (SOTA) implementations on new hardware and up to 85% of the performance for SOTA implementations on existing hardware.

Citations (8)
List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-up Questions

We haven't generated follow-up questions for this paper yet.