Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
149 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
45 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Stripe: Tensor Compilation via the Nested Polyhedral Model (1903.06498v1)

Published 14 Mar 2019 in cs.DC and cs.LG

Abstract: Hardware architectures and ML libraries evolve rapidly. Traditional compilers often fail to generate high-performance code across the spectrum of new hardware offerings. To mitigate, engineers develop hand-tuned kernels for each ML library update and hardware upgrade. Unfortunately, this approach requires excessive engineering effort to scale or maintain with any degree of state-of-the-art performance. Here we present a Nested Polyhedral Model for representing highly parallelizable computations with limited dependencies between iterations. This model provides an underlying framework for an intermediate representation (IR) called Stripe, amenable to standard compiler techniques while naturally modeling key aspects of modern ML computing. Stripe represents parallelism, efficient memory layout, and multiple compute units at a level of abstraction amenable to automatic optimization. We describe how Stripe enables a compiler for ML in the style of LLVM that allows independent development of algorithms, optimizations, and hardware accelerators. We also discuss the design exploration advantages of Stripe over kernel libraries and schedule-based or schedule-space-based code generation.

Citations (31)

Summary

We haven't generated a summary for this paper yet.