Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
80 tokens/sec
GPT-4o
59 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
7 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

A model-driven approach for a new generation of adaptive libraries (1806.07060v1)

Published 19 Jun 2018 in cs.PF, cs.DC, cs.MS, and cs.SE

Abstract: Efficient high-performance libraries often expose multiple tunable parameters to provide highly optimized routines. These can range from simple loop unroll factors or vector sizes all the way to algorithmic changes, given that some implementations can be more suitable for certain devices by exploiting hardware characteristics such as local memories and vector units. Traditionally, such parameters and algorithmic choices are tuned and then hard-coded for a specific architecture and for certain characteristics of the inputs. However, emerging applications are often data-driven, thus traditional approaches are not effective across the wide range of inputs and architectures used in practice. In this paper, we present a new adaptive framework for data-driven applications which uses a predictive model to select the optimal algorithmic parameters by training with synthetic and real datasets. We demonstrate the effectiveness of a BLAS library and specifically on its matrix multiplication routine. We present experimental results for two GPU architectures and show significant performance gains of up to 3x (on a high-end NVIDIA Pascal GPU) and 2.5x (on an embedded ARM Mali GPU) when compared to a traditionally optimized library.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (5)
  1. Marco Cianfriglia (4 papers)
  2. Flavio Vella (25 papers)
  3. Cedric Nugteren (6 papers)
  4. Anton Lokhmotov (11 papers)
  5. Grigori Fursin (20 papers)
Citations (13)