Wavelet Neural Operators (WNOs)
- Wavelet Neural Operators (WNOs) are neural operator frameworks that use wavelet-domain parameterization to capture multiscale, localized features in operator learning for PDEs.
- They efficiently perform feature lifting, wavelet-domain convolution, and projection, achieving O(N) complexity and robust handling of discontinuities.
- WNOs excel in applications ranging from PDE modeling and image processing to uncertainty quantification by leveraging localized wavelet transforms.
A Wavelet Neural Operator (WNO) is a neural operator framework that performs multiscale convolution by parametrizing operator kernels in the wavelet domain, leveraging wavelets’ unique spatial-frequency localization properties. Originating within operator learning for partial differential equations (PDEs), WNOs are designed to overcome the limitations of both Fourier Neural Operators (FNOs) and classical deep neural operator architectures by capturing nonstationary, localized, and hierarchical structures efficiently across various scientific and engineering applications.
1. Mathematical Foundations and Operator Learning Principle
The core objective in operator learning is to approximate a nonlinear map between infinite-dimensional function spaces. For a domain $D \subset \mathbb{R}^d$, WNOs are structured to learn an operator

$$\mathcal{G}_\theta : \mathcal{A} \to \mathcal{U},$$

where $\mathcal{A}$, $\mathcal{U}$ are function spaces (e.g., Banach spaces), and $\theta$ parameterizes the nonlinear solution operator to a family of PDEs.
The functional update in WNOs at layer $j$ is given by

$$v_{j+1}(x) = \sigma\big( W v_j(x) + (\mathcal{K}(\phi)\, v_j)(x) \big),$$

where $v_j$ is the lifted latent representation, $W$ a pointwise linear transform, $\sigma$ a nonlinearity, and $\mathcal{K}(\phi)$ a kernelized integral operator. Distinctively, WNO parameterizes the kernel convolution in the wavelet domain using the pair

$$(\mathcal{K}(\phi)\, v_j)(x) = \mathcal{W}^{-1}\big( R_\phi \cdot \mathcal{W}(v_j) \big)(x),$$

with $\mathcal{W}$, $\mathcal{W}^{-1}$ denoting forward and inverse wavelet transforms, and $\cdot$ indicating pointwise multiplication in the wavelet coefficient domain (Nekoozadeh et al., 2023, Tripura et al., 2022).
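As a concrete illustration, the following is a minimal NumPy/PyWavelets sketch of this wavelet-parameterized kernel; the weight arrays standing in for $R_\phi$ are random placeholders rather than trained parameters, and the variable names are illustrative.

```python
# Sketch of the wavelet-parameterized kernel (K(phi) v)(x) = W^{-1}(R . W v),
# using PyWavelets for the forward/inverse transforms. R is random here; in a
# trained WNO it is a learned parameter tensor.
import numpy as np
import pywt

rng = np.random.default_rng(0)

v = np.sin(2 * np.pi * np.linspace(0, 1, 256))   # latent feature v_j on a 1D grid
wavelet, level = "db4", 3

# Forward DWT: list [cA_L, cD_L, ..., cD_1] of coefficient arrays.
coeffs = pywt.wavedec(v, wavelet, level=level)

# Pointwise multiplication in the coefficient domain. WNO typically parameterizes
# only the coarsest (highest-level) subbands, for locality and noise robustness.
coeffs[0] = rng.standard_normal(coeffs[0].shape) * coeffs[0]   # approximation cA_L
coeffs[1] = rng.standard_normal(coeffs[1].shape) * coeffs[1]   # detail cD_L

# Inverse DWT maps the modulated coefficients back to physical space.
Kv = pywt.waverec(coeffs, wavelet)[: v.size]
print(Kv.shape)  # (256,)
```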
2. Architecture and Implementation Details
WNO layers perform feature lifting, wavelet-domain kernel convolution, and projection to the output space:
- Inputs are lifted to a higher-dimensional latent space via a pointwise network $P$: $v_0(x) = P(a(x))$.
- Iterative update via wavelet-integral convolution as above.
- Projection to output via a shallow projection network.
The discrete workflow for each layer involves:
- Discrete Wavelet Transform (DWT) of features (e.g., using Haar or Daubechies wavelets)
- Multichannel convolution in the highest-scale subbands
- Application of a learned kernel filter in wavelet space, often with a single trainable scale for noise robustness
- Inverse DWT to reconstruct spatial features
- Nonlinearity (e.g., GeLU)
Skip connections and spatial convolutions are sometimes integrated for improved stability and detail recovery, as in Multiscale Wavelet Attention (MWA) or U-Net-enhanced variants (Nekoozadeh et al., 2023, Lei et al., 15 Aug 2024).
Implementation features (a single-layer sketch in code follows this list):
- Linear O(N) computational complexity in sequence size for 2D data, compared to the O(N log N) or O(N^2) cost of FNOs or attention mechanisms (Nekoozadeh et al., 2023).
- Choice of wavelet (Haar for speed or Daubechies for higher accuracy).
- Coefficient parameterization typically restricted to the highest decomposition level, balancing locality and denoising (Tripura et al., 2022, Lei et al., 15 Aug 2024).
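The workflow above can be condensed into a single-layer sketch. This is a hedged, NumPy/PyWavelets-only illustration, not the published implementation: the lifting $P$ and skip weight $W$ are scalars here (pointwise MLPs in practice), and all weights are random stand-ins for learned parameters.

```python
# Illustrative single WNO layer on 2D data: lift -> wavelet-domain convolution
# -> pointwise skip path -> GeLU, mirroring the discrete workflow listed above.
import numpy as np
import pywt

rng = np.random.default_rng(0)

def gelu(x):
    # Common tanh approximation of the GeLU nonlinearity.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def wno_layer(v, wavelet="db4", level=2):
    # Multilevel 2D DWT: coeffs[0] is the coarsest approximation subband cA_L.
    coeffs = pywt.wavedec2(v, wavelet, level=level)
    R = rng.standard_normal(coeffs[0].shape)      # learned kernel weights (random here)
    coeffs[0] = R * coeffs[0]                     # convolution as a pointwise product
    Kv = pywt.waverec2(coeffs, wavelet)[: v.shape[0], : v.shape[1]]
    w = rng.standard_normal()                     # pointwise linear skip path W v
    return gelu(w * v + Kv)

a = rng.standard_normal((64, 64))                 # input function a(x) on a 64x64 grid
v0 = rng.standard_normal() * a                    # lifting P (scalar stand-in)
v1 = wno_layer(v0)
print(v1.shape)  # (64, 64)
```

Stacking several such layers and ending with a shallow projection network recovers the full architecture.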
3. Comparisons and Theoretical Advantages
WNOs offer significant advantages over comparable architectures:
| Aspect | FNO | WNO | DeepONet |
|---|---|---|---|
| Basis domain | Fourier (global) | Wavelet (space-frequency local) | Neural, spatial |
| Discontinuity/edge modeling | Poor | Strong (localized) | Topology-dependent |
| Complexity per layer | O(N log N), global | O(N), local + multiscale | Varies |
| Parameter sharing | Global, aliasing-prone | Multiscale, hierarchical | Depends |
| Data efficiency | Moderate | Higher (feature selection by scale) | Lower, more samples |
- Wavelets’ space-frequency localization enables learning of discontinuous, multiscale, and spatially inhomogeneous patterns, bypassing the global nature of Fourier features (Tripura et al., 2022, Nekoozadeh et al., 2023); a small numerical illustration follows this list.
- WNOs excel at extracting fine details and edges, essential for modeling natural images, turbulent flows, multiphysics PDEs, and signal processing tasks.
- Robustness to high-frequency noise is achieved by parameterizing only high-level wavelet coefficients.
- WNOs can operate on arbitrary input grid resolutions (grid-invariance) and perform zero-shot super-resolution (Tripura et al., 2022, Soin et al., 11 May 2024).
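The locality claim is easy to check numerically. The following sketch (illustrative, using PyWavelets and NumPy) compares a step function's Haar detail coefficients, which are nonzero only at the jump, with its Fourier spectrum, which spreads the discontinuity across all frequencies:

```python
# Locality check: a discontinuity excites one Haar detail coefficient but
# essentially every Fourier mode.
import numpy as np
import pywt

n = 256
# Step function; the jump is placed off a dyadic pair boundary so the Haar
# transform straddles it.
f = np.where(np.arange(n) < n // 2 + 1, 0.0, 1.0)

cA, cD = pywt.dwt(f, "haar")                      # single-level Haar DWT
print(np.count_nonzero(np.abs(cD) > 1e-12))       # 1: only the coefficient at the jump

fhat = np.fft.rfft(f)
print(np.count_nonzero(np.abs(fhat) > 1e-9))      # 129 (= n/2 + 1): all modes excited
```

A global Fourier parameterization must therefore adjust many modes to represent one edge, whereas a wavelet parameterization touches only the coefficients near it.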
4. Extensions, Variants, and Recent Innovations
Recent research has led to multiple advanced WNO architectures and frameworks:
- Multiscale Wavelet Attention (MWA) for Vision Transformers: wavelet-domain operator layers replace self-attention, yielding higher accuracy at linear complexity on CIFAR/Tiny-ImageNet image recognition (Nekoozadeh et al., 2023).
- Multi-fidelity WNO (MF-WNO): Learns from limited high-fidelity and abundant low-fidelity data, outperforming MF-DeepONet and single-fidelity WNO by 1–3 orders of magnitude on surrogate modeling tasks, with robust uncertainty quantification (Thakur et al., 2022).
- U-WNO: Integrates U-Net style multi-path convolution inside each wavelet layer for exceptional recovery of high-frequency/spatial details, mitigating spectral bias (Lei et al., 15 Aug 2024).
- Randomized Prior WNO (RP-WNO): Embeds randomized prior ensembles for scalable, accurate epistemic and aleatoric uncertainty quantification, matching or exceeding Bayesian deep ensembles (Garg et al., 2023).
- Wavelet Diffusion Neural Operator (WDNO): Incorporates diffusion models directly in wavelet space, enabling superior handling of abrupt PDE discontinuities and zero-shot super-resolution for both simulation and control (Hu et al., 6 Dec 2024).
- Combinatorial/Foundational WNOs (NCWNO): Modular, expert-ensemble architectures for multi-task, continual learning with robust transfer and no catastrophic forgetting (Tripura et al., 2023).
- NAS for WNOs (FWNO): Automated architecture discovery using generative flow networks for optimal per-layer wavelet and activation selection, reducing search complexity and improving test accuracy (Soin et al., 11 May 2024).
- Physics-Informed WNO (PIWNO): Leverages an unsupervised, PDE-residual loss for label-free operator learning, achieving high accuracy and data efficiency across nonlinear, parametric systems (N et al., 2023); a residual-loss sketch follows this list.
- Spiking WNO (VS-WNO): Employs event-driven variable spiking neurons, yielding sparse, low-energy computation with negligible loss in accuracy, suitable for edge and embedded scientific computing (Garg et al., 2023).
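To make the PIWNO-style training signal concrete, here is a hedged sketch of a PDE-residual loss for 1D viscous Burgers, $u_t + u\,u_x = \nu u_{xx}$, using finite differences; `u` stands in for a WNO prediction on a space-time grid, and all names are illustrative rather than taken from the cited implementation:

```python
# Label-free loss in the physics-informed spirit: penalize the PDE residual of
# the network output instead of comparing against solution labels.
import numpy as np

def burgers_residual_loss(u, dx, dt, nu=0.01):
    """Mean-squared residual of u_t + u u_x - nu u_xx on interior points of u[t, x]."""
    u_t  = (u[2:, 1:-1] - u[:-2, 1:-1]) / (2 * dt)            # central difference in t
    u_x  = (u[1:-1, 2:] - u[1:-1, :-2]) / (2 * dx)            # central difference in x
    u_xx = (u[1:-1, 2:] - 2 * u[1:-1, 1:-1] + u[1:-1, :-2]) / dx**2
    r = u_t + u[1:-1, 1:-1] * u_x - nu * u_xx
    return np.mean(r**2)

# Sanity check: a constant field solves Burgers exactly, so its residual is zero.
u = np.ones((50, 128))
print(burgers_residual_loss(u, dx=1.0 / 128, dt=0.01))        # 0.0
```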
5. Applications and Empirical Performance
WNOs have been empirically validated on:
- Canonical PDE systems: Burgers’, Darcy flow (rectangular and nonrectangular domains), Allen–Cahn, Navier–Stokes, Poisson, Advection, and the Helmholtz equation (Tripura et al., 2022, Thakur et al., 2022, N et al., 2023).
- Multiscale, chaotic, and discontinuous systems: Demonstrated best-in-class performance on sharp-interface regimes, rough initial data, and spatiotemporal turbulence.
- Vision: Significant improvements over global Fourier-based attention modules for vision transformers (Nekoozadeh et al., 2023).
- Surrogate modeling for UQ: State-of-the-art data efficiency and predictive uncertainty measures (Thakur et al., 2022, Garg et al., 2023).
- Control and generative modeling: WDNO achieves the lowest error in long-term fluid simulations and sharply outperforms state-of-the-art baselines in indirect optimization (e.g., smoke leakage) and on real-world climate data (Hu et al., 6 Dec 2024).
- Digital twins and real-time prediction: Accurate prediction of floating offshore structure responses, two orders of magnitude faster, under both regular and irregular wave conditions (Cao et al., 2023).
- Scientific meta-learning: Compressed, physically aware representations of Green’s functions and pseudo-differential operators achieved using nonstandard wavelet forms and meta-learned wavelet transforms (Feliu-Faba et al., 2019).
- Signal processing: Learnable wavelet filterbanks for nonstationary signal classification and time-frequency interpretation (Stock et al., 2022).
| Task | Baseline Result | WNO-based Result | Gain |
|---|---|---|---|
| PDE operator learning (Darcy, Navier–Stokes, Allen–Cahn) | 1–18% error (FNO/DeepONet/MT) | 0.2–0.8% error | up to 10x lower error |
| Vision transformer (CIFAR-100, ViT-B) | 62.2–72.8% top-1 accuracy (SA/AFNO/GFN) | 75.3% top-1 accuracy (MWA/WNO) | +2–13% accuracy |
| UQ (Burgers’, Darcy) | 0.013–0.019 error | 0.0000408–0.0025 error | 5–10x lower error |
| Control (2D fluid, smoke leakage) | – | 78% reduction vs prior | SOTA |
6. Limitations and Research Directions
While WNOs exhibit superior performance for multiscale and nonstationary phenomena, some notable caveats and current research directions are:
- High-frequency information loss: Standard WNOs, parameterizing only at the highest scale, may underfit sharp details. U-Net enhancements and adaptive activations address this (Lei et al., 15 Aug 2024).
- Label-free operator learning: Physics-informed loss terms outcompete data-driven methods in data-scarce regimes but may incur additional complexity (N et al., 2023).
- Architecture selection: Performance is sensitive to the choice of wavelet, activation, decomposition depth, and block configuration. Automated search via GFlowNet-based NAS (FWNO) systematically optimizes these choices (Soin et al., 11 May 2024).
- Uncertainty quantification: Ensemble and randomized prior methods provide scalable solutions; Bayesian inference remains computationally prohibitive for large neural operators (Garg et al., 2023).
- Nonuniform grids and unstructured domains: Extensions to geometric wavelets and mesh-based WNOs are topics of ongoing research (Hu et al., 6 Dec 2024).
- Hardware and edge deployment: Event-driven and spiking designs (VS-WNO) aim to lower computation and energy requirements for edge and embedded scientific environments (Garg et al., 2023).
- Methodological integration: The multiscale, telescopic width/depth separation concepts from wavelet-based neural networks (WBNNs) are being adapted for even more expressive and compressible WNOs, supporting faster and more robust training (Dechevsky et al., 2022).
7. Summary Table: Core WNO Variants and Features
| Variant/Extension | Key Feature | Main Application/Impact |
|---|---|---|
| Vanilla WNO | Wavelet-based kernel integral, O(N) complexity | General PDE, image, and signal operator |
| MWA (ViT) | Multiscale wavelet attention for transformers | Vision classification, efficient ViTs |
| MF-WNO | Multi-fidelity learning, UQ | Low-data/high-dim parametric PDEs |
| PI-WNO | Physics-informed, label-free training | Data-scarce, physically constrained PDEs |
| RP-WNO | Uncertainty quantification via RPN | UQ for PDE and scientific surrogates |
| U-WNO | U-Net path & adaptive activations | High-frequency, oscillatory PDE solution |
| NCWNO | Multi-task/continual learning, expert ensemble | Foundational operator learning |
| WDNO | Wavelet-domain diffusion, multi-resolution | Long-term, abrupt PDE sim/control, SOTA |
| VS-WNO | Spiking, sparse event-driven neurons | Low-energy, edge/embedded computation |
| FWNO | GFlowNet NAS for per-layer search | Automated model selection, higher accuracy |
WNOs now serve as an organizing principle for operator learning, multiscale surrogate modeling, uncertainty quantification, neural attention mechanisms, and scientific transfer learning, establishing a foundation for robust, efficient, and generalizable neural models in computational and data-driven science.