Zero-Query Frequency-Domain Initialization

Updated 24 January 2026
  • Zero-query frequency-domain initialization is a method that leverages prior spectral knowledge to set model parameters without querying training samples, effectively overcoming spectral bias.
  • It employs techniques such as spectral envelope matching, direct Fourier synthesis, and variance preservation to ensure balanced signal propagation and rapid convergence.
  • This approach demonstrates enhanced performance in applications like operator learning, time-series forecasting, adversarial attacks, and massive MIMO systems.

Zero-query frequency-domain initialization refers to a class of strategies that set model parameters solely by exploiting prior frequency-domain knowledge, without querying training samples or performing task-specific optimization loops. By leveraging spectral properties of data and model structure, these approaches can significantly improve convergence speed, generalization, and stability across neural architectures, signal processing, operator learning, time-series forecasting, adversarial attacks, and massive MIMO systems.

1. Motivation: Spectral Bias and Model Performance

Many parameterized models, especially deep neural networks, exhibit a marked “spectral bias,” meaning they preferentially learn or propagate low-frequency (coarse) information in early stages and struggle with high-frequency (fine) details unless specifically adjusted. In implicit neural representations (INRs), this bias manifests as overly smooth reconstructions or poor recovery of high-frequency content, even as deeper or more expressive networks are used. Frequency-aware initialization designs aim to compensate for these biases directly at initialization, enabling models to represent the full spectrum of target signals without extensive hyperparameter searches or excessive training (Kania et al., 2024, Homma et al., 4 Nov 2025).

Zero-query frequency-domain initialization, therefore, is motivated by the need to overcome spectral bias or frequency misalignment in a resource- and time-efficient manner. This paradigm finds application in operator learning (spectral neural operators), convolutional architectures, decision-based attacks against image detectors, zero-shot time-series forecasting, and wireless channel estimation—demonstrating broad relevance across computational disciplines (Nochumsohn et al., 2024, Poli et al., 2022, Chen et al., 10 Dec 2025, Aghajanyan, 2017, Choi et al., 2020).

2. General Algorithmic Principles and Representative Methods

Zero-query frequency-domain initialization methods share several core principles:

  • No training sample queries or gradient-based optimization: All parameter assignment is determined algebraically from prior knowledge, synthetic surrogates, or Fourier/harmonic structure analysis.
  • Spectral envelope matching: Select initialization or embedding parameters to match the frequency spectrum of the model’s output—prior to training—to that of the target signal or class, often through explicit Fourier analysis or associated metrics such as the Wasserstein distance.
  • Direct synthesis in frequency domain: Filters or weights are sampled or orthogonalized in the Fourier (or DCT) domain, with subsequent transformation (e.g., IFFT) to obtain spatial or temporal representations.
  • Variance and norm preservation: Initialization rules are analytically derived to ensure signal or activation statistics are preserved through frequency-domain layers or truncated spectral bases.

Prominent instantiations include:

  • FreSh (Frequency Shifting): Determines MLP positional embedding hyperparameters by minimizing the Wasserstein distance between the initial and target signal spectra, thereby aligning the network’s innate frequency coverage with that of the data (Kania et al., 2024).
  • SWIM-based spectral initialization: Scales per-layer activations to encode low-to-high frequency content in a principled depth-wise manner, with all frequency-domain structure imposed via sampling and scaling schedules (Homma et al., 4 Nov 2025).
  • Variance-preserving (vp) T1 weighting: Analytically sets spectral weights to maintain global signal variance in frequency-truncated linear operators, preventing vanishing/exploding statistics in operator learning models (Poli et al., 2022).
  • Convolution-Aware Initialization (CAI): Constructs convolutional filters that are orthonormal in the Fourier domain and then maps them back to the spatial domain with the inverse FFT (Aghajanyan, 2017).
  • DCT-based spectral band partitioning for black-box adversarial attacks: Initializes query directions or attack subspaces by selecting appropriate DCT coefficient bands, with ensemble-based “adversarial example soup” for maximally effective zero-query attacks (Chen et al., 10 Dec 2025).

3. Detailed Algorithmic Procedures

3.1 FreSh: Spectrum Alignment via Wasserstein Distance

  • Evaluate the untrained (randomly initialized) model’s output over the domain grid, compute the 2D DFT, and collapse frequency magnitudes into a direction-invariant spectrum vector S_n.
  • Normalize spectra to unit ℓ1-norm for comparison.
  • Compare candidate embedding parameterizations (e.g., frequency scales for sinusoidal encodings) by the Wasserstein distance between the model’s spectrum S_n and that of the target signal.
  • Select the configuration minimizing averaged Wasserstein distance over multiple initialization samples.
  • Empirically, this alignment of spectral envelopes is highly predictive of ultimate model fidelity, enabling single-run performance competitive with exhaustive grid search at a fraction of the cost (Kania et al., 2024).
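The steps above can be sketched in a few lines of NumPy/SciPy. The radial binning, the toy sinusoidal "outputs," and the two-candidate set are illustrative stand-ins, not the paper's exact procedure:

```python
import numpy as np
from scipy.stats import wasserstein_distance

def radial_spectrum(img, n_bins=32):
    # 2-D DFT magnitudes, collapsed into a direction-invariant radial
    # spectrum and normalized to unit l1-norm
    F = np.abs(np.fft.fftshift(np.fft.fft2(img)))
    h, w = img.shape
    yy, xx = np.mgrid[:h, :w]
    r = np.hypot(yy - h / 2, xx - w / 2)
    spec, _ = np.histogram(r, bins=n_bins, weights=F)
    return spec / spec.sum()

def fresh_select(target, candidate_outputs):
    # pick the candidate whose initial output spectrum is closest to the
    # target spectrum under the 1-D Wasserstein distance
    s_t = radial_spectrum(target)
    support = np.arange(len(s_t))
    dists = [wasserstein_distance(support, support,
                                  radial_spectrum(c), s_t)
             for c in candidate_outputs]
    return int(np.argmin(dists))

xx, yy = np.meshgrid(np.linspace(0, 1, 64), np.linspace(0, 1, 64))
target = np.sin(2 * np.pi * 20 * xx)          # high-frequency target signal
candidates = [np.sin(2 * np.pi * 2 * xx),     # overly smooth initialization
              np.sin(2 * np.pi * 20 * xx)]    # spectrally aligned initialization
best = fresh_select(target, candidates)       # index of the aligned candidate
```

In practice the candidates would be untrained model outputs under different embedding hyperparameters, averaged over several random initializations as described above.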

3.2 SWIM Spectral Scheduling

  • Hidden layer weights and biases are assigned using a reservoir sampling procedure (“sampling where it matters”) to prioritize regions critical to function representation.
  • Each layer l is assigned a scale factor s_{1,l}, increasing with depth to encode the progression from low-frequency to high-frequency content.
  • All feature extraction and parameter selection is performed in a single sweep, with only a final linear layer optimized (e.g., by least-squares), and no backpropagation (Homma et al., 4 Nov 2025).
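A minimal sketch of the depth-wise scale schedule and closed-form readout follows. The true SWIM procedure samples weights from data pairs ("sampling where it matters"); here plain Gaussian random features serve as a simplified stand-in, with only the final linear layer fit by least squares and no backpropagation:

```python
import numpy as np

rng = np.random.default_rng(0)

def swim_like_fit(x, y, widths=(64, 64, 64), s0=1.0, growth=2.0):
    # per-layer scale factors s_{1,l} grow with depth, so deeper layers
    # respond to progressively higher-frequency content
    h = x
    for l, width in enumerate(widths):
        scale = s0 * growth ** l                 # low -> high frequency schedule
        W = rng.normal(size=(h.shape[1], width)) * scale
        b = rng.uniform(-np.pi, np.pi, size=width)
        h = np.tanh(h @ W + b)                   # fixed (unlearned) features
    # closed-form readout: least-squares fit of the final linear layer only
    coef, *_ = np.linalg.lstsq(np.c_[h, np.ones(len(h))], y, rcond=None)
    return h, coef

x = np.linspace(0, 1, 200)[:, None]
y = np.sin(2 * np.pi * 5 * x[:, 0])              # toy regression target
h, coef = swim_like_fit(x, y)
pred = np.c_[h, np.ones(len(h))] @ coef
```

The single-sweep structure (sample, scale, solve) is the point of the sketch; the widths, growth factor, and activation are illustrative choices.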

3.3 Frequency-domain Operator Layer Initialization

  • For layers operating in the spectral domain (e.g., after DCT/DFT of inputs), the learnable matrix A ∈ ℝ^{m×m} is initialized with entries A_ij ~ N(0, N/m²) (for a signal of size N truncated to m modes), exactly preserving input variance in the transformed space.
  • For complex weights in fully-complex settings, the standard deviation is decreased by a factor of √2 to account for the variance contributions of the real and imaginary parts.
  • This initialization prevents cumulative variance mismatch that otherwise degrades deep frequency-domain architectures (Poli et al., 2022).
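The variance-preserving property can be checked numerically with a toy end-to-end layer (orthonormal DCT, truncation to m modes, random spectral mixing, inverse DCT); the specific N, m, and trial count are arbitrary:

```python
import numpy as np
from scipy.fft import dct, idct

rng = np.random.default_rng(0)
N, m, trials = 256, 64, 400

# variance-preserving init: A_ij ~ N(0, N / m^2), i.e. std = sqrt(N) / m
A = rng.normal(0.0, np.sqrt(N) / m, size=(m, m))

ratios = []
for _ in range(trials):
    x = rng.normal(size=N)                # unit-variance white input signal
    z = dct(x, norm="ortho")[:m]          # truncate to m spectral modes
    y = A @ z                             # learnable spectral mixing
    out = idct(np.pad(y, (0, N - m)), norm="ortho")
    ratios.append(out.var() / x.var())

ratio = float(np.mean(ratios))            # expected to be close to 1
```

Intuitively, truncation keeps only m of N unit-variance coefficients, and the N/m² entry variance exactly compensates for this lost energy so the spatial-domain output variance matches the input's.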

3.4 Convolution Aware Initialization (CAI)

  • Generate random filter tensors.
  • For each frequency index, orthonormalize (via QR or SVD) the filter bank in the Fourier domain to ensure energy preservation.
  • Transform the orthonormalized filters back to the spatial domain via an inverse FFT.
  • This construction is “zero-query” and ensures balanced spectral propagation throughout deep convolutional stacks, with increased convergence speed and stability (Aghajanyan, 2017).
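A small self-contained sketch of this construction is shown below. One detail is assumed here rather than taken from the paper: orthonormalizing with *real* mixing coefficients (Cholesky whitening of the spectral Gram matrix) keeps each spectrum Hermitian-symmetric, so the inverse FFT is exactly real:

```python
import numpy as np

rng = np.random.default_rng(0)

def cai_filters(n_filters, k):
    # random real k x k filters; their 2-D spectra are Hermitian-symmetric
    W = rng.normal(size=(n_filters, k, k))
    M = np.fft.fft2(W).reshape(n_filters, -1)    # filter bank as spectral rows
    # Gram matrix is real for Hermitian-symmetric spectra; whiten with its
    # (real) Cholesky factor so the rows become orthonormal in Fourier space
    gram = (M @ M.conj().T).real
    L = np.linalg.cholesky(gram)
    M_orth = np.linalg.solve(L, M)
    # map back to the spatial domain; result is real up to rounding error
    return np.fft.ifft2(M_orth.reshape(n_filters, k, k)).real

filters = cai_filters(8, 5)
# by Parseval, spectral orthonormality implies the flattened spatial filters
# are mutually orthogonal with squared norm 1 / k^2
flat = filters.reshape(8, -1)
```

This assumes n_filters ≤ k², so the bank can actually be orthogonalized; QR or SVD in place of the Cholesky whitening gives the same orthonormality up to a rotation.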

3.5 Frequency-band Partitioning in Adversarial Attacks

  • Transform the image into the DCT domain.
  • Partition coefficients into low- and high-frequency bands, selecting bands empirically best suited for real or GAN-generated images (e.g., “10% L + 10% H” for real, “20% L” for generated).
  • Initialize query directions and “adversarial example soups” within the selected frequency subspace.
  • Ensemble-average surrogate-model adversarial samples to maximize zero-query attack success, minimize per-query distortion, and improve perceptual quality (Chen et al., 10 Dec 2025).
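The band-partitioning step can be sketched as follows. The "10% L + 10% H" fractions come from the text above, but the ranking of coefficients by the Manhattan index i+j (a zigzag-style ordering) is an illustrative assumption:

```python
import numpy as np
from scipy.fft import dctn, idctn

def band_mask(shape, low_frac=0.1, high_frac=0.1):
    # boolean mask over 2-D DCT coefficients selecting the lowest `low_frac`
    # and highest `high_frac` of frequencies, ranked by i + j (assumption)
    h, w = shape
    ii, jj = np.mgrid[:h, :w]
    rank = (ii + jj).ravel().argsort()
    n = h * w
    mask = np.zeros(n, dtype=bool)
    mask[rank[: int(low_frac * n)]] = True       # low-frequency band
    mask[rank[n - int(high_frac * n):]] = True   # high-frequency band
    return mask.reshape(shape)

def project_to_band(perturbation, mask):
    # keep only the DCT coefficients inside the selected frequency band
    C = dctn(perturbation, norm="ortho")
    return idctn(C * mask, norm="ortho")

rng = np.random.default_rng(0)
delta = rng.normal(size=(32, 32))                # candidate attack direction
mask = band_mask(delta.shape)                    # "10% L + 10% H" setting
delta_band = project_to_band(delta, mask)        # band-restricted direction
```

Query directions and the ensemble-averaged "soup" members would all be projected through the same mask so every perturbation stays inside the chosen spectral subspace.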

4. Empirical Results Across Domains

Empirical work documents the spectrum of impact that zero-query frequency-domain initialization delivers:

| Method | Key Domain | Performance Impact |
| --- | --- | --- |
| FreSh | INRs / image / video | Near-grid-search PSNR at roughly 1/10th the time |
| SWIM spectral | Regression / MNIST | 20–30% RMSE reduction in regression; >1% error reduction in large MNIST nets |
| VP T1-FDM | Operator learning | 20% reduction in relative L2 error; 3–10× training speedup |
| CAI | Deep convnets | State-of-the-art accuracy, faster convergence (CIFAR-10, SVHN) |
| AES-init (FBA²D) | AIGC detection | Near-ceiling attack success rate (ASR), highest PSNR/SSIM, minimal queries |

Qualitative improvements manifest as reduced per-frequency error (up to 30%) (Kania et al., 2024), sharper high-frequency detail recovery, improved generalization to unseen spectra in forecasting (Nochumsohn et al., 2024), and substantial efficiency gains in training or decision-based attack settings (Chen et al., 10 Dec 2025).

5. Theoretical Underpinnings and Analytic Properties

Zero-query frequency-domain methods are undergirded by several theoretical phenomena:

  • Spectral Alignment Theory: Alignment of model output spectrum at initialization with that of the target guarantees immediate support for relevant modes, decreasing required learning iterations and mitigating spectral bias (Kania et al., 2024).
  • Variance Preservation: Analytical design of initialization (e.g., VP-T1) prevents cumulative vanishing/exploding statistics when truncating or modifying spectral components, a behavior not addressed by classical Xavier/He initializations (Poli et al., 2022).
  • Orthogonality in the Frequency Domain: Ensures that activations neither grow nor contract as signals propagate across layers, preserving gradient flow and representation capacity even in deep architectures (Aghajanyan, 2017).
  • Frequency Coverage and Generalization: In time series zero-shot forecasting, synthetic datasets are designed to cover (span or envelope) the fundamental and harmonic frequencies of prospective real tasks, enhancing frequency generalization and robustness against frequency confusion (Nochumsohn et al., 2024).
  • A plausible implication is that broader adoption of these techniques will further shift deep learning toward spectrally-regularized or spectrum-driven paradigms in scenarios where label-driven queries are expensive or sample efficiency is paramount.

6. Broader Applications and Extensions

Zero-query frequency-domain initialization has been successfully adapted in:

  • Signal processing and communications: HRPE-based frequency-domain parameter extraction for zero-feedback channel estimation in FDD massive MIMO, enabling downlink CSI estimation solely from UL pilots and spectral extrapolation (Choi et al., 2020).
  • Operator learning and PDE surrogates: Streamlined learning of complex operators in fluid dynamics using frequency-domain variants of graph neural networks or transformer architectures (Poli et al., 2022).
  • Time-series forecasting and transfer learning: Synthetic spectrum-driven datasets for robust training and transfer without real sample queries, outperforming data-intensive baseline pretraining (Nochumsohn et al., 2024).
  • Adversarial robustness: Frequency partitioning and spectral-aware initializations to mount zero-query or low-query black-box attacks with near-maximal efficiency and imperceptibility (Chen et al., 10 Dec 2025).
  • Vision, audio, and NLP: Deep convolutional architectures for image, speech, and NLP tasks, leveraging frequency-domain initialization for faster, more stable learning and higher final accuracy (Aghajanyan, 2017).

7. Implementation Guidelines and Practical Considerations

Common recommendations for practitioners include:

  • Employ fast Fourier (FFT, DCT) and linear algebra libraries for all spectral analysis and transformation steps.
  • For embedding hyperparameter selection, use spectrum sizes n ∈ [32, 128], and candidate grids as appropriate for the domain: e.g., SIREN ω₀ ∈ [10, 200], Fourier features σ ∈ [1, 20] (Kania et al., 2024).
  • For CAI, perform orthonormalization per-frequency and convert back via IFFT, with manageable computational overhead given typical filter sizes (Aghajanyan, 2017).
  • In forecasting, estimate fundamental periods from sampling rates or employ mixture/harmonic variants if rates are unknown or variable (Nochumsohn et al., 2024).
  • Average over multiple random initializations or candidate sets to reduce stochasticity; use ablation experiments to optimize spectral partitioning for attacks (Chen et al., 10 Dec 2025).
  • In massive MIMO, limit the HRPE path count L to the minimal number yielding < −10 dB MSE in the training band, with VSS models preferred outdoors and in LOS conditions (Choi et al., 2020).

Zero-query frequency-domain initialization thus establishes a principled, computationally efficient foundation for spectral regularization and capacity alignment across learning tasks, supporting both data-scarce and large-scale applications.
