Wavelet-Based Multi-Band Fusion (LIWT)

Updated 28 May 2026

Wavelet-Based Multi-Band Fusion (LIWT) is a technique that decomposes images using multiscale wavelet transforms to integrate high-resolution panchromatic data with multispectral images.
It employs undecimated transforms like Atrous DWT with cubic-spline filters to extract and inject high-frequency details while minimizing artifacts.
Recent advancements incorporate deep learning frameworks to enhance modulation gain and noise suppression, improving performance in remote sensing and super-resolution applications.

Wavelet-Based Multi-Band Fusion (LIWT) refers to a class of approaches that exploit multiresolution wavelet transforms for fusing distinct frequency components from multiple input images—most typically, high-resolution panchromatic and lower-resolution multispectral images. By leveraging the capacity of wavelets to decompose signals into orientation- and scale-resolved subbands, LIWT seeks to preserve both spatial detail and spectral fidelity in the fused result. Numerous variants exist, from classical undecimated wavelet modulation schemes in remote sensing to deep-learning-based frameworks and implicit neural architectures. This article provides a rigorous exposition of LIWT methods, their mathematical formulations, algorithmic workflows, and comparative performance.

1. Foundations of Wavelet-Based Multi-Band Fusion

At the core of LIWT are multiscale wavelet analyses, such as the undecimated discrete wavelet transform (UDWT/Atrous DWT), classical Mallat orthogonal DWTs, or stationary Haar transforms. An image $I(x,y)$ is decomposed at level $J$ into a coarse approximation $I_A^J$ and sets of detail subbands $\{I_d^j\}_{j=1,\dots,J,\;d\in\{H,V,D\}}$ capturing oriented high-frequency content at increasing scales. In image fusion, these operations are performed separately on each image (e.g., panchromatic and multispectral bands), with recombination strategies designed to integrate spatial and spectral information optimally (Shahdoosti, 2017, Liu et al., 2020, Li et al., 2018, Bhowmik et al., 2010, Duan et al., 2024).

Key wavelet transforms include:

Atrous (à trous) undecimated DWT with B-spline or cubic-spline scaling, providing translation invariance without downsampling, crucial for avoiding artifacts in pan-sharpening (Shahdoosti, 2017).
Daubechies, Haar, and other orthogonal wavelets for multi-level energy separation and efficient subband selection (Bhowmik et al., 2010, Duan et al., 2024).
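
The undecimated decomposition described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in the cited works: it assumes the common separable B3 cubic-spline kernel [1, 4, 6, 4, 1]/16, mirror padding at the borders, and one isotropic detail plane per level (which plays the role of the sum over $d\in\{H,V,D\}$ in the notation above). The names `B3`, `_smooth_axis`, and `atrous_decompose` are hypothetical.

```python
import numpy as np

# Assumed B3 cubic-spline scaling kernel (1D); applied separably along each axis.
B3 = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0


def _smooth_axis(img, axis, step):
    """Filter along one axis with the B3 kernel dilated by `step` (mirror padding)."""
    pad = 2 * step
    widths = [(0, 0)] * img.ndim
    widths[axis] = (pad, pad)
    padded = np.pad(img, widths, mode="reflect")
    n = img.shape[axis]
    out = np.zeros_like(img, dtype=float)
    for k, w in enumerate(B3):
        # Tap k sits at offset (k - 2) * step from the centre sample.
        sl = [slice(None)] * img.ndim
        sl[axis] = slice(k * step, k * step + n)
        out += w * padded[tuple(sl)]
    return out


def atrous_decompose(img, levels):
    """Return (approximation, [w_1, ..., w_J]) such that img == approximation + sum(w_j)."""
    approx = np.asarray(img, dtype=float)
    planes = []
    for j in range(levels):
        step = 2 ** j  # "holes" inserted between kernel taps grow with the level
        smooth = _smooth_axis(_smooth_axis(approx, 0, step), 1, step)
        planes.append(approx - smooth)  # detail lost between successive smoothings
        approx = smooth
    return approx, planes
```

Because no downsampling takes place, every plane has the same shape as the input, and reconstruction is a plain sum of the approximation and the detail planes.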

2. Classical LIWT Pan-Sharpening: High-Pass Modulation via Wavelets

The archetypal LIWT fusion approach replaces the traditional high-pass modulation filter (boxcar) with an undecimated wavelet transform to inject spatial detail from the high-resolution panchromatic image $P(x,y)$ into each multispectral band $M_i(x,y)$ (Shahdoosti, 2017). The process is as follows:

DWT Decomposition: Apply a $J$-level Atrous DWT to both $P$ and $M_i$, yielding approximation and detail subbands:

$P(x,y) = P_A^J(x,y) + \sum_{j=1}^J\sum_{d\in\{H,V,D\}}P_{d}^j(x,y)$

Gain Computation: Compute pixel-wise modulation coefficients:

$\alpha_i(x,y) = \frac{M_i(x,y)}{M_{A,i}^J(x,y)+\epsilon}$

Detail Injection: Form the fused band:

$F_i(x,y) = M_i(x,y) + \alpha_i(x,y)\sum_{j=1}^J\sum_{d\in\{H,V,D\}}P_{d}^j(x,y)$

Reconstruction: Collect all fused bands $F_i(x,y)$ for the final multispectral cube.

Crucially, the use of undecimated cubic-spline filters avoids the ringing and aliasing associated with boxcar high-pass extraction. The modulation gain ensures radiometric consistency and adaptive spatial injection at each pixel (Shahdoosti, 2017).
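
As a purely illustrative single-pixel calculation (the numbers are invented, and the injection rule is assumed to take the usual high-pass-modulation form: fused value = band value + gain × summed panchromatic detail), take $M_i = 100$, $M_{A,i}^J = 80$, a summed panchromatic detail of $10$, and negligible $\epsilon$:

$\alpha_i = \frac{100}{80} = 1.25, \qquad M_i + \alpha_i \cdot 10 = 100 + 12.5 = 112.5$

A pixel that is brighter than its local low-pass average (gain above 1) receives proportionally more detail, while a darker one receives less, which is how the gain keeps the injected detail radiometrically consistent with each band.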

3. Algorithmic Workflows and Implementation

Implementation of LIWT methods follows an explicit mathematical structure and stepwise algorithmic pipeline. The following pseudocode encapsulates the canonical LIWT procedure (Shahdoosti, 2017):

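Input: panchromatic image $P$; multispectral bands $M_i$, resampled to the grid of $P$; number of levels $J$; regularizer $\epsilon$
Output: fused bands $F_i$

1. Decompose $P$ with the $J$-level Atrous DWT (cubic-spline filters) and keep its detail subbands $P_d^j$.
2. For each band $i$:
   a. Decompose $M_i$ with the same transform to obtain the approximation $M_{A,i}^J$.
   b. Compute the gain $\alpha_i(x,y) = M_i(x,y) / (M_{A,i}^J(x,y)+\epsilon)$.
   c. Inject detail: $F_i(x,y) = M_i(x,y) + \alpha_i(x,y)\sum_{j=1}^J\sum_{d}P_d^j(x,y)$.
3. Stack all $F_i$ into the fused multispectral cube.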

Number of levels: $J$ sets the trade-off between spatial injection and preservation of spectral information; a higher $J$ may over-emphasize noise, while a lower $J$ may under-inject detail.
Dynamic-range adjustment: Optional histogram or radiometric matching on output (Shahdoosti, 2017).

Atrous DWT with cubic-spline filters minimizes frequency-domain ripple, reduces artifacts, and maintains the spatial-spectral balance necessary for quantitative and visual fidelity.
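
The procedure in this section can be turned into a short runnable sketch. The code below is an illustration under stated assumptions rather than the reference implementation of the cited method: it assumes the separable B3 kernel [1, 4, 6, 4, 1]/16 with mirror padding, multispectral bands already resampled to the panchromatic grid, and an additive injection of the gain-weighted panchromatic detail. It also uses the fact that, for an undecimated transform, the sum of all detail planes equals the image minus its level-$J$ approximation. The function names and the default values `levels=2` and `eps=1e-6` are hypothetical choices, not recommendations from the source.

```python
import numpy as np

# Assumed B3 cubic-spline kernel (1D), applied separably with dilation 2**j.
B3 = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0


def atrous_approx(img, levels):
    """Level-J a trous approximation of a 2D image (mirror padding at borders)."""
    approx = np.asarray(img, dtype=float)
    for j in range(levels):
        step = 2 ** j
        for axis in (0, 1):
            widths = [(0, 0), (0, 0)]
            widths[axis] = (2 * step, 2 * step)
            padded = np.pad(approx, widths, mode="reflect")
            n = approx.shape[axis]
            out = np.zeros_like(approx)
            for k, w in enumerate(B3):
                sl = [slice(None), slice(None)]
                sl[axis] = slice(k * step, k * step + n)
                out += w * padded[tuple(sl)]
            approx = out
    return approx


def pansharpen(pan, ms, levels=2, eps=1e-6):
    """Fuse pan (H, W) with ms (B, H, W); ms must already be on the pan grid."""
    pan = np.asarray(pan, dtype=float)
    ms = np.asarray(ms, dtype=float)
    # Sum of all detail planes of P = P minus its level-J approximation.
    pan_detail = pan - atrous_approx(pan, levels)
    fused = np.empty_like(ms)
    for i, band in enumerate(ms):
        # Pixel-wise modulation gain; eps guards against near-zero denominators.
        gain = band / (atrous_approx(band, levels) + eps)
        fused[i] = band + gain * pan_detail
    return fused
```

A quick sanity check on this sketch is that a perfectly flat panchromatic image carries no detail, so `pansharpen` should return the multispectral input unchanged in that case.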

4. Deep-Learning and Neural Extensions

Recent developments have adapted the LIWT paradigm to deep neural architectures, multimodal fusion, and arbitrary-scale super-resolution, retaining the fundamental principle of subband-wise multi-band data integration.

WaveFuse (Liu et al., 2020): Integrates multi-level DWT with deep encoders, fusing subbands based on regional energy. Approximation (low-frequency) coefficients are fused via neighborhood energy-weighted averaging; detail (high-frequency) coefficients by maximum-energy selection. Fused features are decoded by a channel-shared decoder.
WaveMamba (Zhu et al., 24 Jul 2025): For RGB-infrared fusion, DWT is applied to intermediate features from two sources, with low-frequency fusion via channel swapping and gated Mamba-based attention, and high-frequency fusion by absolute-max selection. The use of IDWT in the detection head maintains spatial resolution for downstream detection.
Local Implicit Wavelet Transformer (LIWT) (Duan et al., 2024): In super-resolution, wavelet-based decomposition of encoder features produces high-frequency priors via a Wavelet-Enhanced Residual Module (WERM), which are fused with local implicit attention and projective feature fusion. This pipeline is explicitly designed to reconstruct high-frequency details for arbitrary scale factors.

These methods reinforce the generality of the LIWT concept—wavelet-domain splitting for frequency-resolved feature integration, with the fusion operation tailored to the modality and downstream task.
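
The subband fusion rules named above (energy-weighted averaging for low-frequency coefficients, maximum-energy or absolute-max selection for high-frequency coefficients) can be sketched on plain 2D coefficient arrays. This is a simplified illustration, not the WaveFuse or WaveMamba code: the cited methods operate on learned feature maps, and the 3×3 window, edge padding, and function names here are assumptions.

```python
import numpy as np


def regional_energy(coeffs, radius=1):
    """Sum of squared coefficients over a (2r+1) x (2r+1) neighbourhood."""
    coeffs = np.asarray(coeffs, dtype=float)
    h, w = coeffs.shape
    sq = np.pad(coeffs ** 2, radius, mode="edge")
    size = 2 * radius + 1
    return sum(sq[dy:dy + h, dx:dx + w] for dy in range(size) for dx in range(size))


def fuse_low(a, b, eps=1e-12):
    """Low-frequency rule: average weighted by neighbourhood energy."""
    ea, eb = regional_energy(a), regional_energy(b)
    wa = (ea + eps) / (ea + eb + 2 * eps)  # eps gives a 50/50 split where both are zero
    return wa * a + (1.0 - wa) * b


def fuse_high_energy(a, b):
    """High-frequency rule: keep the coefficient from the higher-energy region."""
    return np.where(regional_energy(a) >= regional_energy(b), a, b)


def fuse_high_absmax(a, b):
    """High-frequency rule: keep the coefficient with the larger magnitude."""
    return np.where(np.abs(a) >= np.abs(b), a, b)
```

Selection rules preserve the sharper of the two sources at each position, whereas the weighted average avoids abrupt switches in the smooth low-frequency band.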

5. Comparative Results and Evaluation Metrics

Experimental assessments of classical and deep LIWT methods report gains in both spectral and spatial fidelity over baseline fusion strategies.

Table: Performance metrics for classical LIWT pan-sharpening (Shahdoosti, 2017)

Method	Corr. Coefficient	Mutual Info	QNR
LIWT	0.97	0.54	0.95
Boxcar HP	0.88	0.31	0.81
Wavelet (no gain)	0.89	0.36	0.86
Brovey	0.81	0.32	0.75
PCA	0.82	0.33	0.78
IHS	0.82	0.33	0.77

LIWT achieves the highest correlation with the reference multispectral images (0.97) and the highest mutual information (0.54), reflecting preservation of spectral signatures alongside spatial enhancement. It also attains the highest value of the no-reference QNR metric (0.95).
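
For readers who want to compute the two reference-based metrics in the table, the following NumPy sketch shows one common way to estimate them for a single band. It is an assumption-laden illustration: the source does not state the histogram bin count, the logarithm base, or any normalization of mutual information, so values from this sketch are not directly comparable with the table. QNR is omitted because it combines several sub-indices. The function names and the default `bins=64` are hypothetical.

```python
import numpy as np


def correlation_coefficient(x, y):
    """Pearson correlation between two images of equal shape (undefined for constant inputs)."""
    xc = np.asarray(x, dtype=float).ravel()
    yc = np.asarray(y, dtype=float).ravel()
    xc = xc - xc.mean()
    yc = yc - yc.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def mutual_information(x, y, bins=64):
    """Histogram estimate of mutual information, in bits."""
    joint, _, _ = np.histogram2d(np.ravel(x), np.ravel(y), bins=bins)
    pxy = joint / joint.sum()              # joint probability table
    px = pxy.sum(axis=1, keepdims=True)    # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)    # marginal of y, shape (1, bins)
    nz = pxy > 0                           # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))
```

In practice these would be evaluated per band between the fused result and the reference, then averaged across bands.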

In camouflaged foreground detection, a stationary-wavelet-based LIWT algorithm achieves F-measure = 0.87 (recall 0.91, precision 0.84), with robust gains over state-of-the-art baselines (Li et al., 2018).
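
The reported figures are mutually consistent if the F-measure is taken to be the balanced harmonic mean of precision $P_r$ and recall $R_c$ (an assumption, since the source does not state the weighting):

$F = \frac{2\,P_r\,R_c}{P_r + R_c} = \frac{2 \times 0.84 \times 0.91}{0.84 + 0.91} = \frac{1.5288}{1.75} \approx 0.87$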

6. Practical Considerations and Limitations

Undecimated (shift-invariant) DWT is preferred for image fusion to prevent spatial misalignment and suppress reconstruction artifacts (Shahdoosti, 2017, Li et al., 2018).
Wavelet kernel selection is critical; cubic-spline (B3) and Haar wavelets are commonly used. The choice impacts frequency localization and energy leakage.
Modulation gain normalization stabilizes local contrasts but can be sensitive to low denominator values; appropriate ε regularization must be used.
Detail injection versus noise: Excessive levels or strong gain factors may introduce high-frequency noise; parameter selection should be dataset-adaptive.
Computational efficiency: Non-decimated transforms and per-pixel operations increase runtime and memory demand, but parallel/CUDA implementations can achieve practical speeds (Li et al., 2018).
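
The preference for undecimated transforms can be illustrated with a one-dimensional toy example. The sketch below is not drawn from the cited works; it assumes a single-level Haar filter and circular boundary handling, and the function names are hypothetical. It compares how the detail coefficients of a step edge respond to a one-sample shift under a decimated and an undecimated transform.

```python
import numpy as np


def haar_detail_decimated(x):
    """One-level orthonormal Haar detail coefficients (downsampled by 2)."""
    x = np.asarray(x, dtype=float)
    return (x[0::2] - x[1::2]) / np.sqrt(2)


def haar_detail_undecimated(x):
    """One-level Haar detail at every sample position (circular boundary)."""
    x = np.asarray(x, dtype=float)
    return (x - np.roll(x, -1)) / np.sqrt(2)


signal = np.zeros(16)
signal[8:] = 1.0                  # step edge aligned with the decimation grid
shifted = np.roll(signal, 1)      # same edge, moved by one sample

# Decimated: detail energy depends on where the edge falls relative to the grid.
energy_dec = np.sum(haar_detail_decimated(signal) ** 2)
energy_dec_shifted = np.sum(haar_detail_decimated(shifted) ** 2)

# Undecimated: the coefficients of the shifted signal are the shifted coefficients.
und = haar_detail_undecimated(signal)
und_shifted = haar_detail_undecimated(shifted)
is_shift_equivariant = np.allclose(und_shifted, np.roll(und, 1))
```

In the decimated case the edge is invisible to the detail band at one alignment and fully visible at the other, which is the kind of position-dependent behaviour that produces misregistration artifacts when two images are fused coefficient by coefficient.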

Potential limitations arise when statistical characteristics of the data differ substantially from those of the wavelet basis, warranting adaptation or learning of the analysis-synthesis filters (as in learnable LIWT/NN models) (Duan et al., 2024, Zhu et al., 24 Jul 2025).

7. Applications and Impact

LIWT methods are foundational in multispectral remote sensing, notably for pan-sharpening with high spatial and spectral fidelity. They have also been applied to face recognition via visual-thermal fusion (Bhowmik et al., 2010), detection of camouflaged foregrounds (Li et al., 2018), multi-modal deep object detection (Zhu et al., 24 Jul 2025), and arbitrary-scale image super-resolution (Duan et al., 2024). Across these domains, the multiresolution, directionally sensitive fusion conferred by LIWT offers a principled mechanism for detail transfer and robust feature integration.