Slide FFT on a homogeneous mesh in wafer-scale computing (2401.05427v1)
Abstract: Searches for signals at low signal-to-noise ratios frequently involve the Fast Fourier Transform (FFT). For high-throughput searches, we consider the FFT on the homogeneous mesh of Processing Elements (PEs) of a wafer-scale engine (WSE). To minimize memory overhead in the inherently non-local FFT algorithm, we introduce a new synchronous slide operation (Slide) that exploits the fast interconnect between adjacent PEs. Feasibility of compute-limited performance is demonstrated by the linear scaling of Slide execution times with array size in preliminary benchmarks on the CS-2 WSE. The proposed implementation appears well suited to accelerating FFT-based signal processing in multi-messenger astronomy and to opening its full discovery potential.
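The abstract does not describe how Slide is implemented, so the following is only a toy illustration of the general idea it names: obtaining the non-local operands of an FFT through synchronous transfers between adjacent PEs. Everything here is an assumption made for illustration: a 1D row with one complex sample per PE, ring wraparound at the ends, a radix-2 decimation-in-frequency FFT, and the hypothetical helper names `slide`, `fft_with_slides`, and `naive_dft`. It is plain Python, not CS-2 code, and it does not reproduce the paper's algorithm or benchmarks.

```python
import cmath

def slide(row, hops):
    """Move every value `hops` PEs to the right (left if negative).

    Toy model: each loop iteration stands for one synchronous step in which
    every PE hands its value to an adjacent PE (with ring wraparound).
    """
    step = 1 if hops >= 0 else -1
    for _ in range(abs(hops)):
        row = row[-step:] + row[:-step]
    return row

def fft_with_slides(samples):
    """Radix-2 decimation-in-frequency FFT, one sample per simulated PE.

    Returns the spectrum in natural order and the number of one-hop steps.
    """
    x = [complex(v) for v in samples]
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    hops = 0
    half = n // 2
    while half >= 1:
        from_right = slide(x, -half)  # PE i receives the value of PE i + half
        from_left = slide(x, half)    # PE i receives the value of PE i - half
        hops += 2 * half
        nxt = []
        for i in range(n):
            if i % (2 * half) < half:
                # Upper butterfly output: a + b
                nxt.append(x[i] + from_right[i])
            else:
                # Lower butterfly output: (a - b) times the twiddle factor
                w = cmath.exp(-2j * cmath.pi * (i % half) / (2 * half))
                nxt.append((from_left[i] - x[i]) * w)
        x = nxt
        half //= 2

    bits = n.bit_length() - 1

    def bit_reverse(i):
        r = 0
        for _ in range(bits):
            r = (r << 1) | (i & 1)
            i >>= 1
        return r

    # Decimation in frequency leaves the output in bit-reversed order.
    return [x[bit_reverse(k)] for k in range(n)], hops

def naive_dft(samples):
    """Direct O(n^2) DFT used only as a reference for checking."""
    n = len(samples)
    return [
        sum(samples[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]

if __name__ == "__main__":
    data = [1, 2, 3, 4, 0, -1, 2.5, 1j]
    spectrum, hops = fft_with_slides(data)
    reference = naive_dft(data)
    print(max(abs(a - b) for a, b in zip(spectrum, reference)) < 1e-9, hops)
```

In this sketch a stage with partner distance `half` costs `2 * half` one-hop steps, so an `n`-point transform uses `2 * (n - 1)` steps in total. That count is a property of the toy model only and should not be read as the measured scaling reported for the CS-2.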