
Walsh-Hadamard Neural Operators (WHNO)

Updated 7 April 2026
  • WHNO is a spectral neural operator that uses the Walsh-Hadamard basis to capture sharp discontinuities in PDE solutions.
  • It employs learnable spectral weights on low-sequency coefficients followed by convolutional decoding to achieve high fidelity in benchmark PDE problems.
  • Ensembling WHNO with Fourier operators leverages complementary strengths, yielding up to 35% error reductions for discontinuous phenomena.

The Walsh-Hadamard Neural Operator (WHNO) is a spectral neural operator constructed to approximate solution operators of partial differential equations (PDEs) characterized by discontinuous coefficients or sharp solution features. Unlike standard spectral neural operators based on Fourier transforms, which are highly effective for smooth fields but susceptible to the Gibbs phenomenon around discontinuities, the WHNO leverages the Walsh-Hadamard transform—a basis of orthonormal, piecewise-constant rectangular functions—enabling high-fidelity representation of abrupt jumps and interfaces without spectral ringing. The architecture comprises learnable spectral weights acting on low-sequency Walsh coefficients to capture nonlocal dependencies, followed by a convolutional decoder. Empirical results demonstrate WHNO’s superiority over Fourier-based neural operators when sharp material interfaces are present and further reveal that ensembles combining WHNO and FNO exploit complementary representational properties, achieving substantial error reductions for a suite of benchmark PDEs with discontinuities (Cavallazzi et al., 10 Nov 2025).

1. Mathematical Foundations

1.1 Walsh–Hadamard Basis and Transform

The Walsh functions $\{w_k(x)\}_{k=0}^{\infty}$ constitute an orthonormal basis on $[0,1]$, each function a rectangular wave taking values in $\pm 1$. Unlike sinusoids, which are ordered by frequency, Walsh functions are sequency-ordered: $w_k$ has $k$ zero-crossings, so low $k$ corresponds to broad constant regions and high $k$ to rapid alternation.
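The sequency ordering can be checked numerically. The sketch below is illustrative only and not taken from the source: it builds a discrete Walsh basis by sorting the rows of a Sylvester-type Hadamard matrix by their number of sign changes. The helper names `hadamard`, `sign_changes`, and `walsh_matrix` are hypothetical.

```python
def hadamard(n):
    """Return the n x n Sylvester Hadamard matrix (n a power of 2) with +1/-1 entries."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of 2")
    H = [[1]]
    while len(H) < n:
        # Block recursion: [[H, H], [H, -H]]
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H


def sign_changes(row):
    """Count sign changes along a +1/-1 sequence (the discrete sequency)."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)


def walsh_matrix(n):
    """Hadamard rows reordered so that row k has exactly k sign changes."""
    return sorted(hadamard(n), key=sign_changes)


W = walsh_matrix(8)
for k, row in enumerate(W):
    print(k, "".join("+" if v > 0 else "-" for v in row))
```

Row 0 is constant, row 1 has a single jump at the midpoint, and the last row alternates at every grid point, which matches the description of low versus high sequency.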

For vectors $f \in \mathbb{R}^n$ with $n = 2^m$, the Hadamard matrix $H_n \in \{\pm1\}^{n \times n}$ and its normalized form underpin the discrete Walsh-Hadamard Transform (WHT). Key definitions:

  • $H_1 = [1]$
  • $H_{2n} = \begin{pmatrix} H_n & H_n \\ H_n & -H_n \end{pmatrix}$
  • $\tilde{H}_n = H_n / \sqrt{n}$ (orthonormalization)
  • One-dimensional WHT: $\hat{f} = \tilde{H}_n f$, with $f = \tilde{H}_n \hat{f}$.

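The WHT for $f(x)$ (continuous, $x \in [0,1]$):

$$\hat{f}_k = \int_0^1 f(x)\, w_k(x)\, dx$$

For discrete $f$ on $n$ grid points $x_j$, $j = 0, \dots, n-1$ (orthonormal convention):

$$\hat{f}_k = \frac{1}{\sqrt{n}} \sum_{j=0}^{n-1} f(x_j)\, w_k(x_j)$$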

The two-dimensional (2D) transform applies the WHT along each axis. The Fast Walsh-Hadamard Transform (FWHT) computes this in $O(N \log N)$ time, where $N$ is the total number of grid points.
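A minimal sketch of the fast transform follows, assuming the orthonormal scaling $1/\sqrt{n}$ and the natural (Hadamard) coefficient order rather than sequency order. The function names `fwht` and `fwht2` are illustrative and do not come from the source.

```python
import math


def fwht(values):
    """Orthonormal fast Walsh-Hadamard transform, natural (Hadamard) order."""
    a = list(values)
    n = len(a)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of 2")
    h = 1
    while h < n:
        # Butterfly stage: combine entries that are h apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                x, y = a[i], a[i + h]
                a[i], a[i + h] = x + y, x - y
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [v * scale for v in a]


def fwht2(grid):
    """2D transform: apply fwht to every row, then to every column."""
    rows = [fwht(r) for r in grid]
    cols = [fwht(c) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


f = [1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0]
round_trip = fwht(fwht(f))  # the orthonormal WHT is its own inverse
print(max(abs(a - b) for a, b in zip(round_trip, f)))
```

Because the normalized Hadamard matrix is symmetric and orthogonal, the same routine serves as both the forward and the inverse transform.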

1.2 Relationship to PDE Discontinuities

Walsh basis functions are well suited to representing the piecewise-constant features common in heterogeneous PDEs. Sharp jumps or interfaces tend to yield a sparse Walsh spectrum, which supports efficient low-sequency truncation without significant interface distortion. In contrast, the Fourier basis incurs oscillatory artifacts (the Gibbs phenomenon) near discontinuities, requiring orders of magnitude more modes for comparable sharpness.
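The sparsity claim can be illustrated on a toy signal. The sketch below is an assumption-laden example, not an experiment from the source: it uses an 8-point step whose jump sits at the midpoint (aligned with the dyadic grid) and counts nonzero Walsh-Hadamard and discrete Fourier coefficients. A jump that is not aligned with the dyadic grid generally gives a less sparse Walsh spectrum.

```python
import cmath


def wht(values):
    """Unnormalized Walsh-Hadamard transform (natural order) via butterflies."""
    a = list(values)
    n = len(a)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a[i], a[i + h] = a[i] + a[i + h], a[i] - a[i + h]
        h *= 2
    return a


def dft(values):
    """Plain O(n^2) discrete Fourier transform."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * k * j / n) for j, v in enumerate(values))
        for k in range(n)
    ]


def count_nonzero(coeffs, tol=1e-9):
    return sum(1 for c in coeffs if abs(c) > tol)


step = [1.0] * 4 + [0.0] * 4  # jump at the midpoint of an 8-point grid

print(count_nonzero(wht(step)))  # 2
print(count_nonzero(dft(step)))  # 5
```

The step needs only the constant and the single-jump Walsh function, whereas its DFT carries the mean plus every odd frequency.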

2. Operator Architecture

2.1 High-Level Pipeline

Given a coefficient field on a 2D grid whose two side lengths are powers of 2, the WHNO workflow is:

  1. Input Lifting: Construct […].
  2. Spectral Layers (typically two):
       • Forward 2D WHT: […]
       • Spectral Truncation: Retain only the […] lowest-sequency coefficients: […]
       • Learnable Spectral Weights: Affine mixing in the spectral domain: […]
       • Zero Padding: Expand to […]
       • Inverse WHT: […]
  3. Spatial Mixing & Skip Connections: First layer, no skip; second layer, residual: […].
  4. Decoder: Several dilated 2D convolutions act on […] to yield the output […].

2.2 Spectral-Layer Formulae

Let […] indicate the layer index:

[…]

2.3 Forward Pass Pseudocode

[Forward-pass pseudocode not preserved in this copy.]
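As a rough illustration of the spectral-layer steps listed in Section 2.1 (forward 2D WHT, low-sequency truncation, learnable weights, zero padding, inverse WHT), here is a minimal single-channel sketch. It is not the authors' implementation: the source describes affine mixing over channels, whereas this sketch assumes one channel and purely element-wise multiplicative weights, and it omits lifting, skip connections, and the convolutional decoder. All function names are hypothetical.

```python
import math


def fwht(values):
    """Orthonormal fast Walsh-Hadamard transform, natural (Hadamard) order."""
    a = list(values)
    n = len(a)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a[i], a[i + h] = a[i] + a[i + h], a[i] - a[i + h]
        h *= 2
    return [v / math.sqrt(n) for v in a]


def fwht2(grid):
    """Apply fwht along rows, then along columns."""
    rows = [fwht(r) for r in grid]
    cols = [fwht(c) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def sequency_to_natural(n):
    """perm[s] = natural index of the Walsh function with s sign changes."""
    bits = n.bit_length() - 1
    perm = []
    for s in range(n):
        gray = s ^ (s >> 1)          # Gray code of the sequency index
        rev = 0
        for _ in range(bits):        # bit reversal over log2(n) bits
            rev = (rev << 1) | (gray & 1)
            gray >>= 1
        perm.append(rev)
    return perm


def spectral_layer(v, weights):
    """Scale the k x k low-sequency block by `weights`; zero all other modes."""
    n_rows, n_cols = len(v), len(v[0])
    k = len(weights)
    pr, pc = sequency_to_natural(n_rows), sequency_to_natural(n_cols)
    v_hat = fwht2(v)                                   # forward 2D WHT
    out_hat = [[0.0] * n_cols for _ in range(n_rows)]  # zero padding
    for i in range(k):
        for j in range(k):
            # Truncation + weighting, addressed through the sequency ordering.
            out_hat[pr[i]][pc[j]] = weights[i][j] * v_hat[pr[i]][pc[j]]
    return fwht2(out_hat)                              # inverse 2D WHT


# A field that is constant on 2 x 2 blocks survives k = 2 truncation unchanged.
field = [[1.0, 1.0, 5.0, 5.0],
         [1.0, 1.0, 5.0, 5.0],
         [3.0, 3.0, 7.0, 7.0],
         [3.0, 3.0, 7.0, 7.0]]
identity_weights = [[1.0, 1.0], [1.0, 1.0]]
print(spectral_layer(field, identity_weights))
```

The Gray-code and bit-reversal permutation is one standard way to map sequency indices to the natural order produced by the butterfly algorithm; in a trained model the entries of `weights` would be learnable parameters.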

2.4 Parameterization

All learnable weights: […]. Typical model: […] parameters ([…] spectral, […] decoder).

3. Training Regimes and Experimental Setup

3.1 Loss and Optimization

Training minimizes mean squared error (MSE) across the spatial domain:

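$$\mathcal{L} = \frac{1}{N} \sum_{i=1}^{N} \left( \hat{u}(x_i) - u(x_i) \right)^2$$

where $\hat{u}$ is the predicted field, $u$ the reference solution, and the sum runs over the $N$ grid points.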

Optimization: AdamW, learning rate […] (cosine decay/step), weight decay […], batch size 4 (heat, Darcy), 1–2 (Burgers); 400 epochs.

3.2 Data Generation for Discontinuous PDEs

  • Darcy flow: Binary permeability field […] with 4 random rectangles ([…]). Solve […] with mixed Dirichlet/Neumann boundary conditions.
  • Heat conduction: […] matrix […], inclusions […] or […]. […] in central […] region, Dirichlet […] on boundary, quasi-steady integration.
  • 2D Burgers: […], […]. Three […] blocks with […] at […], periodic boundary, […], […] steps, […].
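To make the Darcy setup concrete, the sketch below generates a binary permeability field containing four random axis-aligned rectangles. Only the "binary field with 4 random rectangles" structure comes from the text above; the grid size, the two permeability values, the rectangle size range, and the function name `binary_permeability` are hypothetical placeholders, and the PDE solve itself is not shown.

```python
import random


def binary_permeability(n=64, n_rect=4, low=0.1, high=1.0, seed=0):
    """Return an n x n two-valued field: `high` background, `low` inside rectangles.

    All default values are illustrative assumptions, not the source's settings.
    """
    rng = random.Random(seed)
    field = [[high] * n for _ in range(n)]
    for _ in range(n_rect):
        height = rng.randint(n // 8, n // 4)
        width = rng.randint(n // 8, n // 4)
        top = rng.randint(0, n - height)
        left = rng.randint(0, n - width)
        for i in range(top, top + height):
            for j in range(left, left + width):
                field[i][j] = low
    return field


sample = binary_permeability(n=32, seed=1)
for row in sample[::4]:
    print("".join("#" if v < 1.0 else "." for v in row))
```

Rectangles may overlap in this sketch; whether the original data generation allows overlap is not stated in the text.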

3.3 Spectral Truncation and Channelization

Typical spectral truncation: […] (i.e., a […] low-sequency block), 16 encoder channels, 64 decoder channels.

4. Empirical Evaluation and Benchmarks

4.1 Steady-State Darcy Flow

With binary permeability and four obstacles, WHNO achieves […] relative error in the pressure field on the […] test set, with maximal errors localized at obstacle boundaries.

4.2 Heat Conduction with Discontinuous Conductivity

Under identical architectures and training, WHNO outperforms the Fourier Neural Operator (FNO) in all primary error metrics for heat conduction with discontinuous conductivity. Summary:

| Method    | MSE       | Mean Rel. Err. | Max Abs Error |
|-----------|-----------|----------------|---------------|
| WHNO      | […]       | […]            | […]           |
| FNO       | […]       | […]            | […]           |
| Advantage | […] lower | […] lower      | […] lower     |

A weighted ensemble ([…]) combining WHNO and FNO further reduces MSE by […] and max error by […].

4.3 2D Burgers Equation with Discontinuous Initial Conditions

For Burgers’ equation ([…]), WHNO demonstrates […] lower MSE and […] lower mean absolute error (MAE) versus FNO. The ensemble with […] realizes a […] MSE and […] MAE reduction:

| Method   | MSE | MAE | Max Error |
|----------|-----|-----|-----------|
| WHNO     | […] | […] | […]       |
| FNO      | […] | […] | […]       |
| Ensemble | […] | […] | […]       |

Across all tasks, the WHNO+FNO ensemble consistently achieves […] lower MSE relative to WHNO alone (up to […] over FNO), and reduces error variance (Cavallazzi et al., 10 Nov 2025).

5. Discussion and Design Rationale

5.1 Mitigating the Gibbs Phenomenon

The rectangular Walsh basis inherently represents step-like or piecewise-constant functions exactly, precluding the overshoot or ringing that afflicts Fourier (oscillatory) bases near discontinuities. For a field with explicit jump discontinuities, the Walsh spectrum is sparse; low-sequency truncation preserves interface sharpness, unlike Fourier truncation which requires many retained modes.
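The contrast can be reproduced on a toy example. The sketch below is illustrative and not from the source: it truncates a 64-point step (jump at the midpoint, so aligned with the dyadic grid) to its 8 lowest-sequency Walsh functions and, separately, to DFT modes with $|k| \le 7$, then compares the reconstructions. The helper names are hypothetical.

```python
import cmath


def hadamard(n):
    """Sylvester Hadamard matrix with +1/-1 entries (n a power of 2)."""
    H = [[1]]
    while len(H) < n:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H


def sign_changes(row):
    return sum(1 for a, b in zip(row, row[1:]) if a != b)


def walsh_truncate(f, keep):
    """Project f onto the `keep` lowest-sequency discrete Walsh functions."""
    n = len(f)
    basis = sorted(hadamard(n), key=sign_changes)[:keep]
    out = [0.0] * n
    for w in basis:
        c = sum(a * b for a, b in zip(f, w)) / n  # coefficient <f, w> / n
        out = [o + c * b for o, b in zip(out, w)]
    return out


def fourier_truncate(f, max_freq):
    """Keep DFT modes with |k| <= max_freq and transform back."""
    n = len(f)
    out = [0.0] * n
    for k in range(-max_freq, max_freq + 1):
        c = sum(v * cmath.exp(-2j * cmath.pi * k * j / n) for j, v in enumerate(f)) / n
        out = [o + (c * cmath.exp(2j * cmath.pi * k * j / n)).real
               for j, o in enumerate(out)]
    return out


n = 64
step = [1.0] * (n // 2) + [0.0] * (n // 2)
walsh_rec = walsh_truncate(step, keep=8)
fourier_rec = fourier_truncate(step, max_freq=7)

print(max(abs(a - b) for a, b in zip(walsh_rec, step)))  # Walsh reconstruction error
print(max(fourier_rec))                                  # exceeds 1: Gibbs overshoot
```

The Walsh reconstruction reproduces the step, while the truncated Fourier reconstruction overshoots the upper plateau near the jump. Exact recovery in the Walsh case relies on the jump being aligned with the dyadic grid.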

5.2 Representation Complementarity

WHNO captures discontinuities and sharp interfaces, while FNO is best suited to smooth oscillatory or gradient-dominated fields. In many physical PDEs both features coexist, and the ensemble leverages the strengths of each: WHNO dominates near interfaces, FNO in smooth interiors. The optimal ensemble weight depends on the proportion of discontinuous versus smooth features (e.g., […] for heat conduction, […] for Burgers).
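One way to pick such a weight is sketched below, under two assumptions not confirmed by the source: that the ensemble is a convex combination $\alpha\, u_{\text{WHNO}} + (1-\alpha)\, u_{\text{FNO}}$, and that $\alpha$ is chosen to minimize MSE on held-out data. With $d = u_{\text{WHNO}} - u_{\text{FNO}}$, the least-squares solution is $\alpha^* = \langle u - u_{\text{FNO}}, d \rangle / \langle d, d \rangle$, clipped to $[0, 1]$. Fields are flattened to plain lists, and all names are hypothetical.

```python
def ensemble(u_whno, u_fno, alpha):
    """Convex combination of two predictions (flattened fields)."""
    return [alpha * w + (1.0 - alpha) * f for w, f in zip(u_whno, u_fno)]


def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def best_alpha(u_whno, u_fno, target):
    """Closed-form MSE-optimal weight, clipped to [0, 1]."""
    d = [w - f for w, f in zip(u_whno, u_fno)]
    denom = sum(x * x for x in d)
    if denom == 0.0:
        return 0.5  # identical predictions: every weight gives the same output
    num = sum((t - f) * x for t, f, x in zip(target, u_fno, d))
    return min(1.0, max(0.0, num / denom))


# Toy validation data with synthetic, uncorrelated errors (not real model output).
target = [0.0, 1.0, 2.0, 3.0]
u_whno = [t + e for t, e in zip(target, [0.2, -0.2, 0.2, -0.2])]
u_fno = [t + e for t, e in zip(target, [-0.1, -0.1, 0.1, 0.1])]

alpha = best_alpha(u_whno, u_fno, target)
combined = ensemble(u_whno, u_fno, alpha)
print(alpha, mse(u_whno, target), mse(u_fno, target), mse(combined, target))
```

When the two models make partly uncorrelated errors, the combined MSE on the data used to fit the weight is no larger than either individual MSE, which is the mechanism behind the complementarity argument.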

5.3 Computational Trade-offs

Both WHNO and FNO have $O(N \log N)$ spectral-layer complexity in the number of grid points $N$. While an ensemble doubles inference time, no extra training is required, and the […] error reduction may be warranted in applications with critical requirements for interface resolution (e.g., composite material design, subsurface flow in fractured media).

  • Use WHNO alone when discontinuities predominate and inference speed is a constraint.
  • Use WHNO+FNO ensemble for maximal accuracy and broad robustness across discontinuous and smooth regions.
  • Use FNO alone for strongly smooth (e.g., Gaussian random field) coefficients or when discontinuities are absent.

6. Summary

The Walsh-Hadamard Neural Operator is a spectral neural operator targeting PDEs with discontinuous coefficients or sharp local features. Its rectangular wave basis and low-sequency spectral weights enable efficient, direct learning of sharp interfaces, eliminating Fourier-induced artifacts. For heterogeneous PDEs, ensembling WHNO with FNO exploits complementary basis properties, delivering state-of-the-art accuracy and robustness with moderate computational overhead (Cavallazzi et al., 10 Nov 2025).
