Frequency-Domain Convolution
- Frequency-domain convolution is a method that transforms signals into the frequency domain, where convolution becomes efficient pointwise multiplication with O(N log N) complexity.
- It is widely applied in digital filtering, image processing, and neural network acceleration, converting spatial convolution into scalable spectral operations.
- This approach supports advanced applications such as dynamic convolution in deep learning, privacy-preserving computations with NTT, and spectral-based time series modeling.
Frequency-domain convolution refers to the computation of convolution operations via linear transformations to the frequency domain, where convolution becomes a pointwise product. This approach exploits the convolution theorem, which states that the (linear or circular) convolution of two signals in the time (or spatial) domain corresponds to elementwise multiplication of their spectra. Frequency-domain convolution forms the algorithmic basis of fast signal processing, efficient deep network architectures, homomorphic inference in privacy-preserving computing, and recent advances in frequency-aware neural networks, as well as classical digital filtering and image processing. Both continuous and discrete domains are relevant, as are real, complex, or finite-field signal representations.
1. Mathematical Foundations and Computational Principles
Let $x[n]$ and $h[n]$ be discrete signals of length $N$. Their linear convolution is
$$y[n] = \sum_{k} x[k]\, h[n-k].$$
The Convolution Theorem states that, for any suitable transform $\mathcal{T}$ (Fourier, Hartley, Cosine, etc.),
$$\mathcal{T}\{x * h\} = \mathcal{T}\{x\} \odot \mathcal{T}\{h\},$$
where $\odot$ denotes pointwise (Hadamard) multiplication. In the context of the DFT or FFT, this result enables convolution in $O(N \log N)$ time, compared to the $O(N^2)$ of direct computation (Francesca et al., 2016, Johansson et al., 2023).
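As a concrete illustration of the theorem, linear convolution is obtained by zero-padding both signals to the full output length before transforming (a minimal NumPy sketch; the helper name is illustrative):

```python
import numpy as np

def fft_linear_convolve(x, h):
    """Linear convolution via the convolution theorem: zero-pad both
    signals to length len(x)+len(h)-1, multiply spectra, invert."""
    n = len(x) + len(h) - 1
    return np.fft.irfft(np.fft.rfft(x, n) * np.fft.rfft(h, n), n)

x = np.array([1.0, 2.0, 3.0, 4.0])
h = np.array([1.0, -1.0, 0.5])
assert np.allclose(fft_linear_convolve(x, h), np.convolve(x, h))
```

For signals this short the direct method is of course faster; the $O(N \log N)$ advantage appears as the lengths grow.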
For 2D signals, such as images, one employs the 2D DFT or alternative transforms (e.g., Hartley, DCT, DWT), similarly achieving a reduction in computational cost for large kernels or high resolutions (Goh et al., 2021, Li et al., 2017).
In number-theoretic settings (finite fields), an analogous strategy employs the number-theoretic transform (NTT), enabling convolution over finite fields for lattices and cryptographic constructions (Bian et al., 2020, Khan et al., 2019).
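The same recipe carries over to finite fields by replacing complex roots of unity with a primitive $N$-th root of unity modulo a prime. A toy sketch over GF(17) with $N = 4$ and root $\omega = 4$ (real NTT implementations use FFT-style butterflies and large NTT-friendly primes; this $O(N^2)$ version only illustrates the exact, rounding-free convolution theorem over a finite field):

```python
p, N, w = 17, 4, 4  # prime modulus, length, primitive N-th root of unity mod p

def ntt(a, root):
    """Naive number-theoretic transform: DFT with powers of `root` mod p."""
    return [sum(a[n] * pow(root, k * n, p) for n in range(N)) % p
            for k in range(N)]

def circ_conv_ntt(a, b):
    """Circular convolution mod p via forward NTT, pointwise product,
    inverse NTT (inverse root, then scale by N^-1 mod p)."""
    C = [(x * y) % p for x, y in zip(ntt(a, w), ntt(b, w))]
    inv_n = pow(N, p - 2, p)                     # N^-1 by Fermat's little theorem
    return [(c * inv_n) % p for c in ntt(C, pow(w, p - 2, p))]

def circ_conv_direct(a, b):
    """Reference: direct circular convolution mod p."""
    return [sum(a[k] * b[(n - k) % N] for k in range(N)) % p for n in range(N)]

assert circ_conv_ntt([1, 2, 3, 4], [5, 6, 7, 8]) == circ_conv_direct([1, 2, 3, 4], [5, 6, 7, 8])
```

Because all arithmetic is exact modular arithmetic, there is no floating-point error, which is what makes the NTT suitable for cryptographic settings.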
2. Canonical Algorithms and Implementation Paradigms
Classic FIR Filtering
For FIR filtering, overlap-add and overlap-save algorithms segment the input into blocks, compute blockwise FFTs, and reconstruct the time-domain output through overlap aggregation. The FFT size $N$ is chosen as $N \ge B + L - 1$ so that a block of $B$ input samples convolved with an $L$-tap impulse response fits without circular wrap-around. Block processing and design rules are detailed in (Johansson et al., 2023):
| Algorithm | Description | Key Steps |
|---|---|---|
| Overlap-Add | Nonoverlapping input blocks; overlapped output tails summed | FFT → multiply → IFFT → add |
| Overlap-Save | Overlapping input blocks; initial circular artifacts dropped | FFT → multiply → IFFT → trim |
The optimal FFT size grows with the filter length $L$ (typically a small multiple of it), and frequency-domain algorithms are universally more efficient than time-domain approaches for general FIR filters (Johansson et al., 2023).
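A minimal overlap-add sketch along these lines (illustrative helper; a production filter would precompute FFT plans and handle streaming input):

```python
import numpy as np

def overlap_add(x, h, block=8):
    """Overlap-add FIR filtering: non-overlapping input blocks are
    convolved with h via FFTs sized to avoid circular wrap-around,
    and the overlapping output tails are summed back together."""
    L = len(h)
    nfft = block + L - 1                     # N >= B + L - 1: no wrap-around
    H = np.fft.rfft(h, nfft)                 # filter spectrum, computed once
    y = np.zeros(len(x) + L - 1)
    for start in range(0, len(x), block):
        seg = np.fft.rfft(x[start:start + block], nfft)
        out = np.fft.irfft(seg * H, nfft)    # one block convolved with h
        end = min(start + nfft, len(y))
        y[start:end] += out[:end - start]    # overlapping tails add up
    return y
```

Each block costs one forward FFT, one spectral multiply, and one inverse FFT, i.e., $O(\log N)$ per output sample rather than $O(L)$.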
Neural Networks: Frequency-Domain Convolutions
Frequency-domain convolutional layers are realized via global transforms of input and kernels, elementwise spectral multiplication, and inverse transforms. Key implementation steps include:
- Zero-pad kernels to input size.
- Compute FFT/DFT (or DWT/DCT) of both input and kernel.
- Multiply spectra (possibly with channel summing).
- Inverse transform to the spatial domain.
Complex-valued or real-valued transforms (FFT, DCT, Hartley) are applicable depending on network and task (Pan et al., 2022, Pan et al., 2024, Li et al., 2017). "Weight fixation" constrains learning by enforcing that frequency-domain kernels maintain spatial support, reducing overfitting (Pan et al., 2022, Pan et al., 2024).
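The steps above can be sketched for a multi-channel 2D layer. This is an illustrative NumPy sketch, not any particular paper's implementation; it computes 'valid'-mode true convolution (kernel flipped relative to the cross-correlation most deep learning frameworks call "convolution"):

```python
import numpy as np

def fft_conv2d(x, w):
    """Frequency-domain 2D convolution of a (C, H, W) input with
    (F, C, kH, kW) kernels: pad kernels to the input size, multiply
    spectra, sum over input channels, transform back, crop to 'valid'."""
    C, H, W = x.shape
    F, _, kH, kW = w.shape
    X = np.fft.rfft2(x, s=(H, W))            # per-channel input spectra
    K = np.fft.rfft2(w, s=(H, W))            # kernels zero-padded to input size
    Y = (X[None, :] * K).sum(axis=1)         # Hadamard product + channel summing
    y = np.fft.irfft2(Y, s=(H, W))           # back to the spatial domain
    return y[:, kH - 1:, kW - 1:]            # drop circularly wrapped rows/cols
```

Cropping the first $kH-1$ rows and $kW-1$ columns removes exactly the entries contaminated by circular wrap-around, leaving the linear 'valid' output.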
3. Applications Across Signal Processing and Deep Learning
Digital Signal Processing (DSP)
Frequency-domain convolution is fundamental in efficient DSP, enabling fast FIR filtering, multirate processing, OFDM channelization, and spectral shaping (Johansson et al., 2023, Yli-Kaakinen et al., 2020). Fast convolution with block transforms underpins low-latency, high-throughput radio and audio pipelines, especially when kernels are long.
Deep Neural Networks
- Accelerated Training and Inference: In large-scale CNNs, frequency-domain convolution reduces computation by collapsing spatial convolutions to minimal per-element multiplies (plus transform overhead), and provides substantial speedups for large input maps or many filters (Goh et al., 2021, Pan et al., 2024).
- All-Frequency Neural Backbones: CEMNet, FDCNN, and TFDMNet implement full or hybrid spectral pipelines, with frequency-domain BatchNorm, Dropout, and nonlinearities. Domain-adaptive weight fixation constrains parameterization (Pan et al., 2022, Pan et al., 2024, Goh et al., 2021).
- Entropy Modeling and Compression: WeConv modules in "WeConvene" perform convolution in the DWT domain, reducing intra-subband correlation, improving entropy coding, and yielding significant BD-Rate gains for learned image compression (Fu et al., 2024).
- Homomorphic Inference: Frequency-domain homomorphic convolution (NTT-based) is central to privacy-preserving neural network inference and protocols such as ENSEI, dramatically reducing the cost of secure convolution (Bian et al., 2020).
- Dynamic Convolution and Attention: Frequency-aware modules such as FADConv and FAT extract DCT-based frequency fingerprints to guide dynamic kernel fusion, outperforming traditional GAP-pooling dynamic conv in attention precision for remote sensing segmentation (Shu et al., 4 Apr 2025).
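As a rough illustration of DCT-based frequency fingerprints (names and shapes here are hypothetical, not the FADConv/FAT implementation): projecting each channel onto a few low-frequency 2D DCT basis functions generalizes global average pooling, which corresponds to the DC term alone:

```python
import numpy as np

def dct_fingerprint(x, kfreq=2):
    """Per-channel frequency descriptor: project each (H, W) feature map
    onto the lowest kfreq x kfreq 2-D DCT-II basis functions. kfreq=1
    reduces to (unnormalized) global average pooling; kfreq>1 adds
    low-frequency structure for attention/kernel-fusion weights."""
    C, H, W = x.shape
    u, v = np.arange(H), np.arange(W)
    feats = []
    for ku in range(kfreq):
        for kv in range(kfreq):
            basis = (np.cos(np.pi * ku * (2 * u + 1) / (2 * H))[:, None]
                     * np.cos(np.pi * kv * (2 * v + 1) / (2 * W))[None, :])
            feats.append((x * basis).sum(axis=(1, 2)))   # one scalar per channel
    return np.stack(feats, axis=1)                       # (C, kfreq*kfreq)
```

A small MLP over these descriptors would then produce the per-kernel fusion weights; the fingerprint itself is the frequency-aware replacement for GAP.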
Multivariate and Time Series Modeling
FTMixer integrates channel-wise DCT-based frequency convolution (FCC) and local windowed spectral convolution (WFC) to combine global and local dependencies in time series, producing state-of-the-art forecasting with linear complexity (Li et al., 2024).
4. Variants: Real-Valued, Complex, Wavelet, and Finite-Field Convolutions
- Real vs. Complex Transforms: Use of DCT or Hartley yields real-valued spectra, avoiding the overhead of complex-value support, and simplifying subsequent operations and normalization (Li et al., 2017, Li et al., 2024).
- Wavelet-Domain Convolution: WeConv applies convolution in the DWT domain, with band-split filtering and iDWT reconstruction. The LL band processes coarse structure; concatenated HF subbands process edges/textures. DWT/iDWT are efficiently implemented as short FIR filter banks (Fu et al., 2024).
- Finite-Field and CRT-Based: For cryptosystems and combinatorial signal generators, frequency-domain convolution in finite fields is further accelerated via the Chinese Remainder Theorem, decomposing large DFTs into smaller, efficiently-computable pieces (Khan et al., 2019).
- Dynamic and Frequency-Aware Modules: Convolutions guided by frequency fingerprints (e.g., DCT patches) allow each layer to adapt kernel weights to multi-frequency energy distributions, outperforming static and GAP-based dynamic approaches (Shu et al., 4 Apr 2025).
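The real-spectra point above is easy to verify: a naive $O(N^2)$ DCT-II (for illustration only) maps real inputs to real coefficients, whereas the DFT requires complex storage:

```python
import numpy as np

def dct2_naive(x):
    """Naive DCT-II (unnormalized): real inputs yield real coefficients."""
    N = len(x)
    n = np.arange(N)
    return np.array([2.0 * np.sum(x * np.cos(np.pi * k * (2 * n + 1) / (2 * N)))
                     for k in range(N)])

x = np.array([1.0, 2.0, 0.0, -1.0])
assert np.iscomplexobj(np.fft.fft(x))       # DFT spectrum needs complex storage
assert not np.iscomplexobj(dct2_naive(x))   # DCT spectrum stays real
```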
5. Computational Complexity, Tradeoffs, and Empirical Outcomes
| Application Domain | Per-Layer Cost | Frequency-Domain Speedup | Notes |
|---|---|---|---|
| FIR Filtering (large $L$) | $O(\log N)$ vs. $O(L)$ per output sample | Grows with $L$ | Optimal for all practical filter sizes (Johansson et al., 2023) |
| CNN Layer (image) | $O(N^2 K^2)$ (spatial) vs. $O(N^2 \log N)$ (F-domain) | Up to order-of-magnitude | Overhead amortizes for large $N$, $K$ (Pan et al., 2024, Goh et al., 2021) |
| Privacy-Preserving NN | Dominated by NTT transforms | $\ge 5\times$ faster | NTT/HE-based secure inference (Bian et al., 2020) |
| Image Compression | $\sim 10\%$ compute overhead | $\ge 4\%$ BD-Rate gain | DWT-domain decorrelation/entropy coding (Fu et al., 2024) |
| Remote Sensing | A few M extra FLOPs | $2$–$3$ pt F1/IoU gain | DCT-based channel attention (FADConv) (Shu et al., 4 Apr 2025) |
Empirical results confirm that frequency-domain approaches, with appropriate weight masking and domain-specific adaptations, can match or surpass spatial baselines on classification, segmentation, compression, and time series tasks, often with lower memory and compute cost (Pan et al., 2024, Goh et al., 2021, Li et al., 2024).
6. Challenges, Limitations, and Extensions
- Nonlinearities: Activation functions (ReLU, sigmoid) do not commute with most spectral transforms. Remedies include spectralized nonlinearities (amplitude-normalized or Laplace-domain), or alternation between spatial and frequency domains with additional transforms (Francesca et al., 2016, Crasmaru, 2018).
- Parameter Explosion: Zero-padding small kernels to input size in frequency-domain nets can introduce an excessive number of free parameters; weight fixation or masking is required (Pan et al., 2022, Pan et al., 2024).
- Overfitting Control: Complex/frequency-domain Dropout, BatchNorm, and variance control are implemented per-real/imag part or by multiplicative Gaussian noise (Pan et al., 2022, Pan et al., 2024).
- Precision and Artifacts: Quantization of spectral coefficients or mismatched transform sizes (e.g., FFT length $N < B + L - 1$) induce distortion and aliasing; block-size selection and quantization error analysis are required (Johansson et al., 2023). Circular convolution in the frequency domain can induce spatial artifacts unless trimmed or corrected (Goh et al., 2021).
- Domain Mismatch: Some operations—e.g., generic FC layers or loss functions—lack efficient frequency counterparts, necessitating domain transitions or specialized adaptations (Francesca et al., 2016, Li et al., 2017).
- Edge Handling: Border effects are managed by symmetric extension, zero-padding, or explicit signal extension before transformation as in DWT-based layers (Fu et al., 2024).
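The circular-convolution pitfall noted above can be demonstrated directly: an FFT size below `len(x) + len(h) - 1` wraps the tail of the linear convolution back onto the start of the block, while padding to the full length reproduces linear convolution exactly:

```python
import numpy as np

x = np.arange(8.0)                    # 8-sample signal
h = np.array([1.0, 1.0, 1.0])         # 3-tap filter
full = len(x) + len(h) - 1            # linear convolution length: 10

# FFT size 8 < 10: circular convolution; the tail aliases onto the start
short = np.fft.irfft(np.fft.rfft(x, 8) * np.fft.rfft(h, 8), 8)
# FFT size 10: matches linear convolution exactly
good = np.fft.irfft(np.fft.rfft(x, full) * np.fft.rfft(h, full), full)

assert np.allclose(good, np.convolve(x, h))
assert not np.allclose(short, np.convolve(x, h)[:8])
```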
7. Prospects and Research Directions
Frequency-domain convolution remains a vital principle for both classical and modern signal processing. In deep learning, ongoing lines include:
- Entirely spectral architectures with minimal domain transitions (Pan et al., 2024, Li et al., 2017).
- Adaptation of dynamic and attention modules to leverage richer frequency fingerprints beyond DC terms (e.g., small DCT blocks, learnable spectral bases) (Shu et al., 4 Apr 2025).
- Design of frequency-adaptive or hybrid time/frequency networks that balance memory, compute, and expressive power, as in TFDMNet and FTMixer (Pan et al., 2024, Li et al., 2024).
- Exploitation of spectral convolution in encrypted domains for privacy-preserving applications (Bian et al., 2020).
- Extensions to non-Euclidean domains, e.g., graphs or finite fields, via spectral graph theory or CRT-based decompositions (Khan et al., 2019).
Advances in transform efficiency, hardware acceleration, and domain-adapted learning are expected to further integrate frequency-domain convolution as a default methodology in large-scale, adaptive, and secure signal processing and learning architectures.