Wavelet Score-Based Models

Updated 25 January 2026

Wavelet score-based models are generative frameworks that leverage multiresolution wavelet transforms and score-based diffusion to model data distributions with conditional independence across scales.
They employ a multistep sampling scheme from coarse to fine scales, significantly boosting computational efficiency and stability through per-scale SDEs.
Applications span image synthesis, restoration, time-series generation, and 3D MRI artifact correction, achieving state-of-the-art performance and dramatic inference acceleration.

Wavelet score-based models constitute a class of generative and restorative probabilistic frameworks that leverage the multiresolution properties of wavelet transforms together with the theoretical machinery of score-based diffusion models. By parameterizing or learning data distributions and scores directly in the wavelet domain, these models exploit conditional independence across scales and spatial/frequency localization, yielding improved computational efficiency, stability, and enhanced fidelity in various signal and image processing tasks. Pioneering developments include rigorous mathematical analyses, novel network architectures, and accelerated sampling strategies, producing state-of-the-art results in image synthesis, restoration, time series generation, and artifact correction.

1. Mathematical Foundations of Wavelet Score-Based Modeling

Wavelet score-based models extend classical score-based generative models (SGMs), which learn the score $\nabla_x \log p_t(x)$, i.e. the gradient of the log-density of the noised data, to the wavelet domain. In the canonical SGM, a forward stochastic differential equation (SDE) progressively destroys data structure by adding Gaussian noise, while the reverse-time SDE uses a learned score function to synthesize data:

$dX_t = f(X_t, t) dt + g(t) dW_t,\quad X_0 \sim p\,;$

$dY_s = \left[-f(Y_s,T-s) + g(T-s)^2\,\nabla\log p_{T-s}(Y_s)\right] ds + g(T-s) d\widetilde W_s\,.$
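The reverse-time construction can be checked on a case where the score is known in closed form. The sketch below assumes the Ornstein-Uhlenbeck choice $f(x,t) = -x$, $g = \sqrt{2}$ and one-dimensional Gaussian data $X_0 \sim \mathcal N(m, s^2)$, for which $p_t = \mathcal N(m e^{-t},\, s^2 e^{-2t} + 1 - e^{-2t})$ and the reverse drift is $y + 2\,\nabla \log p_{T-s}(y)$. The function name `sample_reverse_ou` and all parameter values are illustrative assumptions; the analytic score stands in for a learned network.

```python
import math
import random


def sample_reverse_ou(m, s, n_samples=2000, T=5.0, n_steps=400, seed=0):
    """Euler-Maruyama samples of the reverse SDE for dX = -X dt + sqrt(2) dW.

    The data law is assumed to be N(m, s^2), so the score of p_t is analytic.
    """
    rng = random.Random(seed)
    delta = T / n_steps
    noise_scale = math.sqrt(2.0 * delta)
    # Y_0 ~ N(0, 1), which approximates p_T for large T.
    ys = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    for k in range(n_steps):
        t = T - k * delta                      # forward time matching reverse time s
        decay = math.exp(-t)
        mean_t = m * decay                     # mean of p_t
        var_t = s * s * decay * decay + 1.0 - decay * decay  # variance of p_t
        ys = [
            # drift = y + 2 * score, with score = -(y - mean_t) / var_t
            y + delta * (y + 2.0 * (-(y - mean_t) / var_t))
            + noise_scale * rng.gauss(0.0, 1.0)
            for y in ys
        ]
    return ys


if __name__ == "__main__":
    samples = sample_reverse_ou(m=2.0, s=0.5, n_samples=500)
    mean = sum(samples) / len(samples)
    print("sample mean:", mean)
```

With enough steps, the sample mean and standard deviation should land close to $m$ and $s$, up to Monte Carlo and discretization error.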

Wavelet-based factorization further decomposes $x$ into multiscale coefficients $w = (x_J, \bar x_J, \ldots, \bar x_1)$ using an orthonormal transform $W$, where $x_J$ is the coarsest approximation and $\bar x_j$ denotes the detail coefficients at scale $j$. By the chain rule, the density factorizes across scales as $p(x) = \prod_\ell p(w_\ell \mid w_{>\ell})$, which allows the definition of conditional, scale-wise score functions

$s_\ell(w_\ell; w_{>\ell},t) = \nabla_{w_\ell} \log p_t(w_\ell | w_{>\ell})\,,$

and independent SDEs per scale. Under suitable spectral assumptions (e.g., for Gaussian fields with power-law spectra), the condition number of the covariance in the wavelet domain is uniformly bounded, so the number of discretization steps per scale can be $N = O(1)$, independent of data dimensionality (Guth et al., 2022).
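A small numerical illustration of why conditioning improves per scale, under simplifying assumptions: take an idealized power-law spectrum $S(k) = k^{-\eta}$ on frequencies $k = 1, \ldots, d-1$ and let dyadic frequency bands $[2^j, 2^{j+1}-1]$ stand in for wavelet scales. This is a Fourier-domain caricature rather than the analysis of Guth et al.; the function name is hypothetical.

```python
def spectrum_condition_numbers(d, eta):
    """Return (global, worst per-band) condition numbers for S(k) = k**(-eta).

    Frequencies are k = 1..d-1 with d a power of two; each dyadic band
    [2**j, 2**(j+1) - 1] is used as a stand-in for one wavelet scale.
    """
    if d < 2 or d & (d - 1):
        raise ValueError("d must be a power of two >= 2")

    def spectrum(k):
        return float(k) ** (-eta)

    # Ratio of largest to smallest eigenvalue over the whole spectrum.
    global_cond = spectrum(1) / spectrum(d - 1)

    worst_band = 1.0
    lo = 1
    while lo < d:
        hi = 2 * lo - 1  # last frequency of the band starting at lo
        worst_band = max(worst_band, spectrum(lo) / spectrum(hi))
        lo *= 2
    return global_cond, worst_band


if __name__ == "__main__":
    for d in (64, 1024, 16384):
        print(d, spectrum_condition_numbers(d, eta=2.0))
```

The global ratio grows like $(d-1)^\eta$, while the per-band ratio stays below $2^\eta$ for every $d$, which is the mechanism behind a dimension-independent step count per scale.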

2. Algorithmic Structure and Sampling Schemes

Wavelet score-based generative modeling (WSGM) adopts a multistep generative procedure. Sampling proceeds coarsest to finest scale:

Initialize the coarsest coefficients $x_J$ from a standard normal distribution, then run the discretized reverse SDE over all timesteps to obtain a sample of $x_J$.
For each detail scale $j$, sample the conditional coefficients $\bar x_j$ given $x_j$, using the learned conditional score network.
After each scale, apply the inverse wavelet step to combine $x_j$ and $\bar x_j$ into the next finer approximation $x_{j-1}$, ending in the spatial domain.
Use uniform Euler–Maruyama discretization for each reverse SDE, applying the same time grid across all scales, with updates:

$w_{\ell}^{k+1} = w_{\ell}^k + \delta\,[w_{\ell}^k + 2\,s_\ell(w_{\ell}^k | w_{>\ell}, t_k)] + \sqrt{2\delta}\,z_\ell^{k+1},\quad z_\ell^{k+1} \sim \mathcal N(0, I).$

Notably, all wavelet levels utilize equal numbers of time steps, and total sampling complexity is linear in signal dimensionality. Extensions include adaptive time stepping, higher-order SDE solvers, and plug-and-play score estimation (Guth et al., 2022).
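The coarse-to-fine loop can be sketched for a one-dimensional signal as follows. Everything model-specific here is an assumption made for illustration: `toy_score` is a hypothetical stand-in for a learned conditional score network (it assumes zero-mean Gaussian coefficients and ignores the coarse input), the Haar wavelet is used for the inverse step, and the per-scale standard deviations are made-up values. Only the update rule follows the Euler–Maruyama formula above.

```python
import math
import random


def toy_score(w, coarse, t, sigma):
    """Hypothetical stand-in for a learned score s(w | coarse, t).

    Assumes coefficients ~ N(0, sigma^2) at t = 0 and ignores `coarse`.
    """
    var_t = sigma ** 2 * math.exp(-2 * t) + 1.0 - math.exp(-2 * t)
    return [-wi / var_t for wi in w]


def reverse_em(n_coeffs, coarse, sigma, rng, T=4.0, n_steps=200):
    """Run w <- w + delta * (w + 2 * score) + sqrt(2 * delta) * z for n_steps."""
    delta = T / n_steps
    w = [rng.gauss(0.0, 1.0) for _ in range(n_coeffs)]
    for k in range(n_steps):
        t = T - k * delta
        s = toy_score(w, coarse, t, sigma)
        w = [
            wi + delta * (wi + 2.0 * si) + math.sqrt(2.0 * delta) * rng.gauss(0.0, 1.0)
            for wi, si in zip(w, s)
        ]
    return w


def inverse_haar_step(approx, detail):
    """Merge approximation and detail coefficients into the next finer scale."""
    r = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / r, (a - d) / r])
    return out


def sample_coarse_to_fine(detail_sigmas, coarse_sigma=1.0, seed=0):
    """detail_sigmas[0] is the coarsest detail scale, the last entry the finest."""
    rng = random.Random(seed)
    x = reverse_em(1, None, coarse_sigma, rng)        # coarsest coefficients x_J
    for sigma in detail_sigmas:
        detail = reverse_em(len(x), x, sigma, rng)    # details given current x_j
        x = inverse_haar_step(x, detail)              # x_{j-1}
    return x


if __name__ == "__main__":
    signal = sample_coarse_to_fine([0.9, 0.7, 0.5, 0.3])
    print(len(signal))
```

Each scale uses the same number of steps, so the total cost is dominated by the finest scale and grows linearly with the signal length.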

In practical implementations for images (e.g., WaveDM), the process selectively diffuses only low-frequency bands while conditioning on both degraded spectrum and high-frequency refinements. For time series (WaveletDiff), DWT is recursively applied, and denoising diffusion models are independently defined per level (Huang et al., 2023, Wang et al., 13 Oct 2025).
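To make the recursive DWT concrete, here is a minimal sketch of a multilevel orthonormal Haar decomposition and its inverse in plain Python. The Haar wavelet and the function names are choices made for this illustration; the cited models may use other wavelet families and library implementations. The coefficient ordering follows $(x_J, \bar x_J, \ldots, \bar x_1)$.

```python
import math


def haar_dwt(signal, levels):
    """Return [approx_J, detail_J, ..., detail_1] for an orthonormal Haar DWT.

    len(signal) must be divisible by 2**levels.
    """
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    r = math.sqrt(2.0)
    approx = list(signal)
    details = []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / r for a, b in pairs])  # high-pass
        approx = [(a + b) / r for a, b in pairs]         # low-pass
    return [approx] + details[::-1]                      # coarsest first


def haar_idwt(coeffs):
    """Invert haar_dwt, merging one detail level at a time."""
    r = math.sqrt(2.0)
    approx = list(coeffs[0])
    for detail in coeffs[1:]:
        finer = []
        for a, d in zip(approx, detail):
            finer.extend([(a + d) / r, (a - d) / r])
        approx = finer
    return approx


def energy(values):
    """Sum of squares, used to check Parseval's identity."""
    return sum(v * v for v in values)


if __name__ == "__main__":
    x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
    coeffs = haar_dwt(x, levels=3)
    print([len(c) for c in coeffs])
    print(haar_idwt(coeffs))
```

Because the transform is orthonormal, the total energy of the coefficients equals that of the signal, which is the property behind the Parseval-type constraints mentioned for time-series models.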

3. Architectures, Conditioning, and Spectral Constraints

Architectures for wavelet score-based modeling are tightly coupled to the multi-scale structure of the wavelet domain:

UNet-based Score Networks: Used in natural image tasks (e.g., WaveDM), where separate light-weight and WideResNet-style UNet modules handle high- and low-frequency wavelet bands, with bottleneck attention and explicit concatenation of degraded spectrum and predicted high-frequency bands as inputs.
Multilevel Transformers: Employed in time series synthesis (WaveletDiff), with a dedicated "LevelTransformer" per wavelet scale. Cross-level attention and adaptive gating enable selective exchange among scales; Parseval energy preservation constraints are enforced at each level to align synthesized spectral content with true data (Wang et al., 13 Oct 2025).
Wavelet Convolution: Some models (e.g., 3D-WMoCo) replace standard convolutional layers with convolutions performed directly on DWT subbands, enlarging the receptive field without a corresponding growth in parameter count (Zhang et al., 4 Nov 2025).

Conditioning strategies are tailored to each modality: image restoration networks concatenate predicted and degraded spectral information as input to the score predictor (Huang et al., 2023); 3D MRI restoration models (3D-WMoCo) employ orthogonal 2D score priors for different anatomical planes, combined into a pseudo-3D prior by alternating between them across timesteps (Zhang et al., 4 Nov 2025).

4. Accelerated Inference and Complexity Analysis

A major thrust of wavelet score-based models is dramatic acceleration of inference relative to conventional spatial-domain SGMs. Theoretical analysis shows that for power-law spectra and near-Gaussian multiscale distributions, the number of steps per scale does not grow with input size, enabling $O(d)$ end-to-end inference (Guth et al., 2022).

For instance, WaveDM’s Efficient Conditional Sampling (ECS) halts the reverse process early (e.g., after 5 steps out of a possible 1000), relying on the well-conditioned wavelet domain and band separation to reconstruct images with minimal quality degradation (Huang et al., 2023). MRI artifact correction shows corresponding reductions from hours (SDE-MRI, PFAD) to minutes for large 3D medical volumes (Zhang et al., 4 Nov 2025).

The following table compares per-image inference time (in seconds) for WaveDM and a patch-based diffusion baseline (PatchDM), together with the resulting speedup:

Task / Dataset	WaveDM (Sec/Image)	PatchDM (Sec/Image)	Speedup
RainDrop (720×480)	0.30	24	×80
DPDD defocus	0.47	365	×777
SIDD real-denoise	0.062	9.3	×150
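The speedup column is simply the ratio of the two timing columns. A quick arithmetic check, using only the numbers from the table above (the dictionary layout and function name are illustrative):

```python
# Per-image inference times in seconds: (WaveDM, PatchDM), copied from the table.
timings = {
    "RainDrop (720x480)": (0.30, 24.0),
    "DPDD defocus": (0.47, 365.0),
    "SIDD real-denoise": (0.062, 9.3),
}


def speedup(wavedm_seconds, patchdm_seconds):
    """Speedup of WaveDM over PatchDM as a ratio of per-image times."""
    return patchdm_seconds / wavedm_seconds


speedups = {task: speedup(*times) for task, times in timings.items()}

if __name__ == "__main__":
    for task, factor in speedups.items():
        print(f"{task}: x{factor:.0f}")
```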

Similar complexity improvements are rigorously demonstrated for WSGM in synthetic and physical systems (Guth et al., 2022).

5. Applications in Image, Signal, and Medical Data Domains

Wavelet score-based models find applications across image restoration, synthesis, time series generation, and inverse problems:

Image Restoration and Synthesis: WaveDM achieves state-of-the-art PSNR and SSIM across raindrop removal, rain streak removal, dehazing, deblurring, demoiréing, and denoising, while running roughly 80× to 777× faster than patch-based diffusion on the benchmarks reported above (Huang et al., 2023).
Time Series Generation: WaveletDiff outperforms all prior time- and frequency-domain generative models, attaining threefold reductions in discriminative and Context-FID scores and improved correlation/predictive metrics across energy, finance, and neuroscience datasets (Wang et al., 13 Oct 2025).
3D MRI Artifact Correction: 3D-WMoCo establishes new benchmarks for motion artifact removal in clinical MRI, combining wavelet-domain SDE, orthogonal 2D priors, and wavelet convolution to achieve PSNR gains >6 dB and runtime reductions from 1184 min (SDE-MRI) to 14 min for $240^3$ volumes (Zhang et al., 4 Nov 2025).
Scientific and Physical Systems: WSGM yields domain-optimal complexity in Gaussian random fields and physical models near phase transitions, such as the critical $\varphi^4$ Gibbs field (Guth et al., 2022).

6. Limitations, Theoretical Guarantees, and Extensions

Wavelet score-based models presuppose that the data possess strong multiscale or approximately Gaussian structure in the wavelet domain; performance may be suboptimal for highly non-Gaussian or weakly multiscale phenomena (Guth et al., 2022). Other caveats include sensitivity to the choice of wavelet family, boundary handling, and inter-scale overlap.

Theoretical results substantiate $O(1)$ per-scale step count under spectral regularity, with explicit bounds on KL divergence, discretization error, and spectrum preservation. Extensions include adaptive step size, higher-order solvers, plug-and-play score combination, and adaptation to other inverse problems such as MRI/CT reconstruction and data near criticality (Guth et al., 2022, Zhang et al., 4 Nov 2025).

7. Comparative Evaluation and Practical Impact

Across the reported domains, wavelet score-based models deliver order-of-magnitude inference acceleration and improved fidelity relative to both vanilla diffusion/score-based models and single-scale generative networks. Through explicit multiscale coefficient modeling and domain-aware architectures, these models make diffusion-based inference considerably more practical for scientific data, medical imaging, and synthetic data generation (Huang et al., 2023, Wang et al., 13 Oct 2025, Guth et al., 2022, Zhang et al., 4 Nov 2025).

References (4)

Guth et al., Wavelet Score-Based Generative Modeling (2022)

Huang et al., WaveDM: Wavelet-Based Diffusion Models for Image Restoration (2023)

Wang et al., WaveletDiff: Multilevel Wavelet Diffusion For Time Series Generation (2025)

Zhang et al., Wavelet-Optimized Motion Artifact Correction in 3D MRI Using Pre-trained 2D Score Priors (2025)
