Automatic Library Generation for Modular Polynomial Multiplication

Published 5 Sep 2016 in cs.SC | (1609.01010v1)

Abstract: Polynomial multiplication is a key algorithm underlying computer algebra systems (CAS), and its efficient implementation is crucial for the performance of CAS. In this paper we design and implement algorithms for polynomial multiplication using approaches based on the fast Fourier transform (FFT) and the truncated Fourier transform (TFT). We improve on the state of the art in both theoretical and practical performance. The SPIRAL library generation system is extended and used to automatically generate and tune the performance of a polynomial multiplication library that is optimized for memory hierarchy, vectorization and multi-threading, using new and existing algorithms. The performance tuning has been aided by automation: many code choices are generated, and intelligent search is used to find the "best" implementation on a given architecture. The performance of autotuned implementations is comparable to, and in some cases better than, the best hand-tuned code.
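To make the FFT-based approach named in the abstract concrete, here is a minimal sketch of modular polynomial multiplication using a number-theoretic transform, that is, an FFT over the integers modulo a prime. It is an illustration only, not the code SPIRAL generates. The prime `MOD`, the primitive root `ROOT`, and the function names `ntt` and `poly_mul_mod` are assumptions chosen for this example and do not come from the paper.

```python
# Illustrative sketch only; MOD, ROOT, and all function names are assumed.
MOD = 998244353  # prime of the form 119 * 2**23 + 1, supports sizes up to 2**23
ROOT = 3         # a primitive root modulo MOD


def ntt(values, invert=False):
    """Iterative radix-2 transform modulo MOD; len(values) must be a power of two."""
    a = list(values)
    n = len(a)

    # Bit-reversal permutation so the butterflies can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]

    length = 2
    while length <= n:
        # Primitive length-th root of unity (its inverse for the inverse transform).
        w_len = pow(ROOT, (MOD - 1) // length, MOD)
        if invert:
            w_len = pow(w_len, MOD - 2, MOD)
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(start, start + half):
                u = a[k]
                v = a[k + half] * w % MOD
                a[k] = (u + v) % MOD
                a[k + half] = (u - v) % MOD
                w = w * w_len % MOD
        length *= 2

    if invert:
        n_inv = pow(n, MOD - 2, MOD)  # modular inverse of n via Fermat's little theorem
        a = [x * n_inv % MOD for x in a]
    return a


def poly_mul_mod(f, g):
    """Multiply coefficient lists f and g (lowest degree first) modulo MOD."""
    if not f or not g:
        return []
    out_len = len(f) + len(g) - 1
    n = 1
    while n < out_len:  # zero-pad up to the next power of two
        n *= 2
    fa = ntt(f + [0] * (n - len(f)))
    ga = ntt(g + [0] * (n - len(g)))
    pointwise = [x * y % MOD for x, y in zip(fa, ga)]
    return ntt(pointwise, invert=True)[:out_len]
```

For example, `poly_mul_mod([1, 2, 3], [4, 5])` multiplies 1 + 2x + 3x^2 by 4 + 5x. The sketch always pads to the next power of two, so its cost jumps each time the product length crosses one; the truncated Fourier transform is designed to avoid that padding overhead.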
