
Performance Tuning of a Parallel 3-D FFT Package OpenFFT (1501.07350v2)

Published 29 Jan 2015 in cs.MS and cs.DC

Abstract: The fast Fourier transform (FFT) is a primitive kernel in numerous fields of science and engineering. OpenFFT is an open-source parallel package for 3-D FFTs, built on a communication-optimal domain decomposition method that minimizes the volume of communication. In this paper, we analyze and tune the performance of OpenFFT, paying particular attention to tuning the communication, which dominates the run time of large-scale calculations. We first analyze its performance on different machines to understand the behavior of both the package and the machines. Based on this performance analysis, we develop six communication methods that aim to cover varied calculation scales on a variety of computational platforms. OpenFFT is then augmented with auto-tuning of communication, which selects the best-performing method at run time. Numerical results demonstrate that the optimized OpenFFT is able to deliver relatively good performance in comparison with other state-of-the-art packages at different computational scales on a number of parallel machines.
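The run-time selection described in the abstract can be illustrated with a minimal sketch. This is not OpenFFT's code or API: the function name `autotune`, its parameters, and the candidate callables are hypothetical, and the abstract does not say how OpenFFT measures or compares its six methods. The sketch simply times each candidate a few times and keeps the one with the smallest best-case time.

```python
import time
from typing import Callable, Dict, Tuple


def autotune(
    candidates: Dict[str, Callable[[], None]],
    trials: int = 3,
    timer: Callable[[], float] = time.perf_counter,
) -> Tuple[str, Dict[str, float]]:
    """Time each candidate and return (name of fastest, best time per candidate).

    Hypothetical helper for illustration only; it is not part of OpenFFT.
    """
    if not candidates:
        raise ValueError("at least one candidate is required")
    if trials < 1:
        raise ValueError("trials must be >= 1")

    best_times: Dict[str, float] = {}
    for name, run in candidates.items():
        samples = []
        for _ in range(trials):
            start = timer()
            run()  # execute one exchange with this candidate method
            samples.append(timer() - start)
        # Use the minimum to reduce the influence of one-off system noise.
        best_times[name] = min(samples)

    # Pick the candidate whose best observed time is smallest.
    winner = min(best_times, key=best_times.get)
    return winner, best_times
```

In a distributed-memory setting, every process would need to agree on the same choice, for example by comparing the slowest time observed across processes for each method. That step is omitted here and is an assumption about how such a scheme might be arranged, not a description of the paper's implementation.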

Citations (2)
