SMAT: An Input Adaptive Sparse Matrix-Vector Multiplication Auto-Tuner (1210.2536v1)

Published 9 Oct 2012 in cs.MS and cs.DC

Abstract: Sparse matrix-vector multiplication (SpMV) is an important kernel in scientific and engineering applications. Previous optimizations are specific to a sparse matrix format and leave the choice of the best format to application programmers. In this work we develop an auto-tuning framework to bridge the gap between the specific optimized kernels and their general-purpose use. We propose an SpMV auto-tuner (SMAT) that provides a unified interface based on compressed sparse row (CSR) to programmers by implicitly choosing the best format and the fastest implementation for any input sparse matrix at runtime. SMAT leverages a data mining model, which is formulated based on a set of performance parameters extracted from 2373 matrices in the UF sparse matrix collection, to quickly search for the best combination. The experiments show that SMAT achieves a maximum performance of 75 GFLOP/s in single-precision and 33 GFLOP/s in double-precision on Intel, and 41 GFLOP/s in single-precision and 34 GFLOP/s in double-precision on AMD. Compared with the sparse functions in the MKL library, SMAT runs more than 3 times faster.
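
To make the CSR-based interface concrete, here is a minimal Python sketch of a reference SpMV over CSR arrays, followed by a toy format chooser. The names `csr_spmv`, `row_length_stats`, and `pick_format`, the `ell_padding_limit` threshold, and the row-length rule itself are assumptions made for illustration only. The abstract does not describe SMAT's performance parameters or its data mining model, so the chooser is a hypothetical stand-in for the idea of inspecting the input matrix before selecting a format, not SMAT's method.

```python
from typing import List, Sequence, Tuple


def csr_spmv(row_ptr: Sequence[int], col_idx: Sequence[int],
             values: Sequence[float], x: Sequence[float]) -> List[float]:
    """Compute y = A @ x for a matrix A stored in CSR form."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i+1]-1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def row_length_stats(row_ptr: Sequence[int]) -> Tuple[float, int]:
    """Return (mean, max) number of nonzeros per row."""
    lengths = [row_ptr[i + 1] - row_ptr[i] for i in range(len(row_ptr) - 1)]
    return sum(lengths) / len(lengths), max(lengths)


def pick_format(row_ptr: Sequence[int], ell_padding_limit: float = 1.5) -> str:
    """Hypothetical rule, NOT SMAT's model: prefer ELL when rows are of
    similar length (little padding needed), otherwise stay with CSR."""
    mean, longest = row_length_stats(row_ptr)
    if mean > 0 and longest / mean <= ell_padding_limit:
        return "ELL"
    return "CSR"


# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0]

y = csr_spmv(row_ptr, col_idx, values, [1.0, 1.0, 1.0])
chosen = pick_format(row_ptr)
```

A caller only ever supplies CSR arrays. In a real auto-tuner the chooser would dispatch to a format-specific kernel after converting the data, a step this sketch omits.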

Authors (4)
  1. Jiajia Li (43 papers)
  2. Xiuxia Zhang (1 paper)
  3. Guangming Tan (20 papers)
  4. Mingyu Chen (31 papers)
Citations (4)