
A decomposition method with minimum communication amount for parallelization of multi-dimensional FFTs (1302.6189v1)

Published 6 Feb 2013 in cs.NA and cond-mat.mtrl-sci

Abstract: The fast Fourier transform (FFT) is an essential primitive that has been applied in various fields of science and engineering. In this paper, we present a decomposition method for parallelizing multi-dimensional FFTs that requires a smaller communication amount than previously proposed methods for all ranges of the number of processes. This is achieved by two distinguishing features: adaptive decomposition and transpose order awareness. In the proposed method, the FFT data are decomposed on a row-wise basis that maps the multi-dimensional data into one-dimensional data and translates the corresponding coordinates from multiple dimensions into one dimension, so that the resulting one-dimensional data can be divided and allocated equally to the processes. As a result, unlike previous works in which the dimensions of decomposition are pre-defined, our method can adaptively decompose the FFT data on the lowest possible dimensions depending on the number of processes. In addition, this row-wise decomposition provides many alternatives for the data transpose, and different transpose orders result in different amounts of communication. We identify the best transpose orders, with the smallest communication amounts, for the 3-D, 4-D, and 5-D FFTs by analyzing all possible cases. Given both its communication efficiency and its scalability, our method is promising for the development of highly efficient parallel packages for the FFT.
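
The row-wise decomposition described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`block_range`, `unravel`, `owned_coords`), the row-major ordering, and the rule for distributing leftover points are all assumptions made for illustration. It only shows the general idea of flattening a multi-dimensional index space into one dimension, splitting that range as evenly as possible among processes, and translating each flat index back to coordinates.

```python
def block_range(total, nprocs, rank):
    """Half-open range [start, stop) of flat indices assigned to `rank`.

    Assumption: leftover points go to the lowest-numbered ranks, so block
    sizes differ by at most one.
    """
    base, extra = divmod(total, nprocs)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def unravel(flat, shape):
    """Translate a row-major flat index back into multi-dimensional coordinates."""
    coords = []
    for n in reversed(shape):
        flat, c = divmod(flat, n)
        coords.append(c)
    return tuple(reversed(coords))


def owned_coords(shape, nprocs, rank):
    """Coordinates of the grid points that `rank` owns after flattening `shape`."""
    total = 1
    for n in shape:
        total *= n
    start, stop = block_range(total, nprocs, rank)
    return [unravel(i, shape) for i in range(start, stop)]


if __name__ == "__main__":
    # Hypothetical example: a 2 x 3 index space shared by 4 processes.
    for rank in range(4):
        print(rank, owned_coords((2, 3), 4, rank))
```

In this toy case, splitting only along the first axis could keep at most 2 of the 4 processes busy, whereas splitting the flattened range gives every process at least one point. That is the sense in which a flattened decomposition can adapt to the number of processes, under the assumptions stated above.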

Authors (2)
  1. Truong Vinh Truong Duy (6 papers)
  2. Taisuke Ozaki (43 papers)
Citations (29)
