Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
139 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
46 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

QDOT: Quantized Dot Product Kernel for Approximate High-Performance Computing (2105.00115v1)

Published 30 Apr 2021 in math.NA, cs.MS, and cs.NA

Abstract: Approximate computing techniques have been successful in reducing computation and power costs in several domains. However, error sensitive applications in high-performance computing are unable to benefit from existing approximate computing strategies that are not developed with guaranteed error bounds. While approximate computing techniques can be developed for individual high-performance computing applications by domain specialists, this often requires additional theoretical analysis and potentially extensive software modification. Hence, the development of low-level error-bounded approximate computing strategies that can be introduced into any high-performance computing application without requiring additional analysis or significant software alterations is desirable. In this paper, we provide a contribution in this direction by proposing a general framework for designing error-bounded approximate computing strategies and apply it to the dot product kernel to develop qdot -- an error-bounded approximate dot product kernel. Following the introduction of qdot, we perform a theoretical analysis that yields a deterministic bound on the relative approximation error introduced by qdot. Empirical tests are performed to illustrate the tightness of the derived error bound and to demonstrate the effectiveness of qdot on a synthetic dataset, as well as two scientific benchmarks -- Conjugate Gradient (CG) and the Power method. In particular, using qdot for the dot products in CG can result in a majority of components being perforated or quantized to half precision without increasing the iteration count required for convergence to the same solution as CG using a double precision dot product.

Citations (3)

Summary

We haven't generated a summary for this paper yet.