Permutation-Avoiding FFT-Based Convolution (2506.12718v1)
Abstract: Fast Fourier Transform (FFT) libraries are widely used for evaluating discrete convolutions. Most FFT implementations follow some variant of the Cooley-Tukey framework, in which the transform is decomposed into butterfly operations and index-reversal permutations. While butterfly operations dominate the floating-point operation count, the memory access patterns induced by index-reversal permutations significantly degrade the FFT's arithmetic intensity. In practice, discrete convolutions are often applied repeatedly with a fixed filter. In such cases, we show that the index-reversal permutations involved in both the forward and backward transforms of standard FFT-based convolution implementations can be avoided by deferring to a single offline permutation of the filter. We propose a multi-dimensional, permutation-avoiding convolution procedure within a general radix Cooley-Tukey framework. We perform numerical experiments to benchmark our algorithms against state-of-the-art FFT-based convolution implementations. Our results suggest that developers of FFT libraries should consider supporting permutation-avoiding convolution kernels.