
LatentUnfold: Unified Blind Image Restoration

Updated 22 December 2025
  • LatentUnfold is a unified framework for blind image restoration that integrates interpretable optimization, nonparametric degradation modeling, and conditional latent diffusion priors.
  • It employs a multi-stage architecture featuring a Multi-Granularity Degradation-Aware module, a Degradation-Resistant Latent Diffusion Model, and an Over-Smoothing Correction Transformer.
  • Experimental results across benchmarks such as SIDD, GoPro, and LOL-v2 demonstrate state-of-the-art performance in PSNR and SSIM, highlighting its robustness and efficacy.

LatentUnfold (formally UnfoldLDM) is a unified deep unfolding network for blind image restoration (BIR), integrating interpretable optimization principles, nonparametric degradation modeling, and conditional latent diffusion priors within a multi-stage architecture. Developed to address the dual limitations of degradation-specific dependency and over-smoothing bias inherent in classical Deep Unfolding Networks (DUNs), LatentUnfold introduces a three-part pipeline: Multi-Granularity Degradation-Aware modeling, Degradation-Resistant Latent Diffusion Priors, and Over-Smoothing Correction Transformers. This plug-and-play structure achieves state-of-the-art results across a wide spectrum of BIR tasks by jointly estimating the unknown degradation and restoring structured, high-frequency image details (He et al., 22 Nov 2025).

1. Blind Restoration Optimization Formulation

The blind restoration model observes a degraded image $y \in \mathbb{R}^{c \times h \times w}$ generated via an unknown linear process:

$$y = Dx + n$$

where $x$ is the latent clean image, $D$ is an unknown degradation matrix, and $n$ is additive noise. To capture structure and reduce complexity, $D$ is factorized as a Kronecker product:

$$D = M^T \otimes W$$

with $W \in \mathbb{R}^{c \times h \times h}$ and $M \in \mathbb{R}^{c \times w \times w}$. The energy minimization for restoration is

$$L(x) = \frac{1}{2}\|y - Dx\|_2^2 + \frac{1}{2}\|y - WxM\|_2^2 + \lambda\,\phi(x)$$

where $\phi(x)$ is a learned image prior and $\lambda$ weights the regularization. The problem is solved via $K$-stage proximal-gradient unfolding, applying block-coordinate descent over the fidelity terms $g(x)$ and $h(x)$, followed by a learned proximal operator. Specifically, at each stage $k$:

  • $\hat{x}_k = x_{k-1} - \beta_k \nabla g(x_{k-1})$
  • $\tilde{x}_k = x_{k-1} - \gamma_k \nabla h(x_{k-1})$
  • $x_k = \mathrm{prox}_{\lambda,\phi}(\hat{x}_k, \tilde{x}_k)$
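A minimal numerical sketch of one unfolding stage, using NumPy with a simple averaging function standing in for the learned proximal operator (the `prox` argument and the step sizes are illustrative, not from the paper):

```python
import numpy as np

def unfold_stage(x_prev, y, D, W, M, beta, gamma, prox):
    """One stage of K-stage proximal-gradient unfolding (illustrative).

    g(x) = 0.5 * ||y - D x||^2   with x column-major vectorized
    h(x) = 0.5 * ||y - W x M||^2 with x kept as a matrix
    """
    # Holistic gradient step: grad g(x) = D^T (D vec(x) - vec(y))
    r = D @ x_prev.flatten(order="F") - y.flatten(order="F")
    x_hat = x_prev - beta * (D.T @ r).reshape(x_prev.shape, order="F")
    # Structured gradient step: grad h(x) = W^T (W x M - y) M^T
    x_tilde = x_prev - gamma * (W.T @ (W @ x_prev @ M - y) @ M.T)
    # Learned proximal operator fuses both estimates
    return prox(x_hat, x_tilde)

rng = np.random.default_rng(0)
h, w = 4, 3
W = rng.standard_normal((h, h))
M = rng.standard_normal((w, w))
D = np.kron(M.T, W)            # Kronecker factorization D = M^T ⊗ W
x0 = rng.standard_normal((h, w))
y = W @ x0 @ M                 # noiseless observation, so x0 is a fixed point
x1 = unfold_stage(x0, y, D, W, M, 0.1, 0.1, lambda a, b: 0.5 * (a + b))
```

With a noiseless observation generated from `x0` itself, both gradients vanish and the stage returns `x0` unchanged; the identity $\mathrm{vec}(WxM) = (M^T \otimes W)\,\mathrm{vec}(x)$ (column-major vec) makes the two fidelity terms agree.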

2. Multi-Granularity Degradation-Aware (MGDA) Module

MGDA replaces analytic gradients with data-driven surrogates, enabling end-to-end handling of unknown degradations:

  • Holistic Degradation: Two Siamese Visual State Space (VSS) networks ($\mathrm{VSS}^D_k$, $\mathrm{VSS}^{D^T}_k$) estimate $D$ and its transpose, producing

$$\hat{x}_k = x_{k-1} - \beta_k\, \mathrm{VSS}^{D^T}_k\big(\mathrm{VSS}^D_k(x_{k-1}) - y\big)$$

  • Structured Decomposition: Neural blocks $\mathcal{D}_M$ and $\mathcal{D}_W$ alternately estimate the components $M_k$ and $W_k$, constructed via normalized outputs from concatenated feature maps. The structured fidelity update is

$$\tilde{x}_k = x_{k-1} - \gamma_k\, W_k^T (W_k x_{k-1} M_k - y) M_k^T$$

An intra-stage consistency loss

$$L_\text{ISDA} = \sum_{k=2}^{K} 2^{-(K-k)} \|\hat{x}_k - \tilde{x}_k\|_1$$

promotes alignment between holistic and structural branches.
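The exponential weighting in $L_\text{ISDA}$, which emphasizes disagreement at later stages, can be sketched directly (the list-of-arrays interface is illustrative):

```python
import numpy as np

def isda_loss(x_hats, x_tildes):
    """Intra-stage consistency: sum_{k=2}^{K} 2^{-(K-k)} * ||x_hat_k - x_tilde_k||_1.

    x_hats, x_tildes: lists of per-stage estimates; index 0 corresponds to stage 1.
    """
    K = len(x_hats)
    total = 0.0
    for k in range(2, K + 1):               # stage 1 carries no consistency term
        weight = 2.0 ** (-(K - k))          # later stages weighted more heavily
        total += weight * np.abs(x_hats[k - 1] - x_tildes[k - 1]).sum()
    return total
```

For $K = 3$ the weights are $1/2$ and $1$, so the final stage's holistic/structured disagreement dominates the penalty.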

3. Degradation-Resistant Latent Diffusion Model (DR-LDM)

The proximal operator in LatentUnfold is realized by a conditional latent diffusion model designed for degradation invariance:

  • Latent Prior Extraction: In Phase I, a Prior Inference (PI) network maps $[\hat{x}_k, \tilde{x}_k, x_{GT}]$ to a compact latent prior $P_k^h$.
  • Diffusion Forward: For $t = 1, \dots, T$ steps,

$$q(P_k^{h,t} \mid P_k^{h,t-1}) = \mathcal{N}\big(P_k^{h,t};\; \sqrt{1-\beta^t}\, P_k^{h,t-1},\; \beta^t I\big)$$

with $\alpha^t = 1 - \beta^t$ and $\bar{\alpha}^t = \prod_{i=1}^{t} \alpha^i$.

  • Diffusion Reverse: A denoising network $\epsilon_\theta$ predicts the noise given the noisy latent prior and a conditioning vector $P_k^c = \mathrm{PI}'([\hat{x}_k, \tilde{x}_k])$. The recursion is:

$$P_k^{h,t-1} = \frac{1}{\sqrt{\alpha^t}}\Big(P_k^{h,t} - \frac{1-\alpha^t}{\sqrt{1-\bar{\alpha}^t}}\, \epsilon_\theta(\cdot)\Big) + \sqrt{1-\alpha^t}\,\epsilon^t$$

After $T$ steps, the sampled prior $\hat{P}_k^h$ is passed to the detail recovery module.
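The forward schedule and reverse recursion follow the standard DDPM form; a small sketch, where the closed-form jump to step $t$ and the externally supplied noise prediction are illustrative stand-ins for the full sampler:

```python
import numpy as np

def cumulative_alphas(betas):
    """abar_t = prod_{i<=t} (1 - beta_i) for the forward noising schedule."""
    return np.cumprod(1.0 - betas)

def forward_sample(p0, abar_t, eps):
    """Closed-form q(P^t | P^0): sqrt(abar_t) * P^0 + sqrt(1 - abar_t) * eps."""
    return np.sqrt(abar_t) * p0 + np.sqrt(1.0 - abar_t) * eps

def reverse_step(p_t, eps_pred, alpha_t, abar_t, noise):
    """One reverse update, matching the recursion for P_k^{h,t-1} in the text."""
    mean = (p_t - (1.0 - alpha_t) / np.sqrt(1.0 - abar_t) * eps_pred) / np.sqrt(alpha_t)
    return mean + np.sqrt(1.0 - alpha_t) * noise
```

With a short schedule such as $T = 3$ (the configuration reported for LatentUnfold), the cumulative products are cheap to compute; in a real implementation the conditioning vector $P_k^c$ would enter through the network producing `eps_pred`.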

4. Over-Smoothing Correction Transformer (OCFormer)

OCFormer is a U-shaped network that fuses intermediate results and the diffusion posterior to restore high-frequency textures:

  • Degradation-Resistant Attention (DRA): Features from $[\hat{x}_k, \tilde{x}_k]$ are enriched by learning self-attention weights through mixed $1 \times 1$ and $3 \times 3$ depthwise convolutions:

$$F' = \mathrm{Softmax}\big((QK^T)/I\big)\, V + F$$

  • Prior-Guided Detail Recovery (PDR): The prior $\hat{P}_k^h$ modulates normalized features:

$$F'' = \mathrm{Linear}_1(\hat{P}_k^h) \odot \mathrm{LN}(F') + \mathrm{Linear}_2(\hat{P}_k^h)$$

$$F_x = F' + \mathrm{GELU}(W_G F'') \odot (W_H F'')$$

The final output $x_k$ is generated by the U-Net decoder.
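The PDR modulation can be sketched with plain arrays; the weight matrices, precomputed scale/shift vectors (the outputs of $\mathrm{Linear}_1$/$\mathrm{Linear}_2$ on the prior), and the tanh-approximated GELU are illustrative stand-ins for the learned layers:

```python
import numpy as np

def gelu(z):
    """Tanh approximation of GELU."""
    return 0.5 * z * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (z + 0.044715 * z**3)))

def pdr(F_prime, scale, shift, W_G, W_H, eps=1e-5):
    """Prior-Guided Detail Recovery (illustrative)."""
    # LayerNorm over the channel (last) axis
    mu = F_prime.mean(axis=-1, keepdims=True)
    var = F_prime.var(axis=-1, keepdims=True)
    ln = (F_prime - mu) / np.sqrt(var + eps)
    # F'' = Linear1(P_hat) ⊙ LN(F') + Linear2(P_hat)
    F2 = scale * ln + shift
    # F_x = F' + GELU(W_G F'') ⊙ (W_H F'')
    return F_prime + gelu(F2 @ W_G) * (F2 @ W_H)
```

The residual structure is easy to verify: if `W_G` is all zeros, the gate closes and the module reduces to the identity on `F_prime`.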

5. Unified Iterative Restoration Algorithm

The end-to-end unfolding procedure, as summarized in the provided pseudocode, executes KK stages. Each stage alternates between MGDA steps to estimate both holistic and structured degradations, then applies DR-LDM to sample a latent prior, and finally invokes OCFormer for refined reconstruction. This approach is designed as plug-and-play; it can be integrated as a wrapper for existing DUN-based methods.
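The outer loop reduces to a thin driver over the three modules; the callables below are stand-ins for the learned networks, not the paper's API:

```python
def latent_unfold(y, x_init, K, mgda, dr_ldm, ocformer):
    """Illustrative K-stage driver: MGDA fidelity steps -> DR-LDM prior
    sampling -> OCFormer proximal refinement, repeated per stage."""
    x = x_init
    for k in range(1, K + 1):
        x_hat, x_tilde = mgda(x, y, k)       # holistic + structured gradient steps
        prior = dr_ldm(x_hat, x_tilde)       # sample degradation-resistant latent prior
        x = ocformer(x_hat, x_tilde, prior)  # refine and recover high-frequency detail
    return x

# Stub usage: each "stage" simply increments the estimate
result = latent_unfold(
    y=None, x_init=0, K=3,
    mgda=lambda x, y, k: (x, x),
    dr_ldm=lambda a, b: None,
    ocformer=lambda a, b, p: a + 1,
)
```

Because each stage consumes only the previous estimate and the observation, any DUN-style method exposing these three hooks can be wrapped this way.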

6. Training Procedures and Loss Functions

The training is phased:

  • Phase I: Pretrain PI and OCFormer with

$$L^{I}_{\text{Total}} = L_{\text{Rec}} + \zeta_1 L_{\text{ISDA}}$$

where $L_{\text{Rec}} = \|x_K - x_{GT}\|_1$.

  • Phase II: Train DR-LDM and fine-tune the entire framework with

$$L^{II}_{\text{Total}} = L_{\text{Rec}} + \zeta_2 L_{\text{ISDA}} + \zeta_3 L_{\text{Diff}}$$

with $L_{\text{Diff}} = \|\hat{P}_k^h - P_k^h\|_1$ and $\zeta_1 = \zeta_2 = \zeta_3 = 1$ in practice.
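Both phase objectives are weighted sums of L1 terms; a compact sketch (the scalar `L_isda` input and function names are illustrative):

```python
import numpy as np

def l1(a, b):
    """||a - b||_1, as used by both L_Rec and L_Diff."""
    return np.abs(np.asarray(a) - np.asarray(b)).sum()

def phase1_loss(x_K, x_gt, L_isda, zeta1=1.0):
    """Phase I: L_Rec + zeta1 * L_ISDA (pretrains PI and OCFormer)."""
    return l1(x_K, x_gt) + zeta1 * L_isda

def phase2_loss(x_K, x_gt, L_isda, p_hat, p_gt, zeta2=1.0, zeta3=1.0):
    """Phase II: adds the diffusion prior term L_Diff = ||P_hat - P||_1."""
    return l1(x_K, x_gt) + zeta2 * L_isda + zeta3 * l1(p_hat, p_gt)
```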

7. Experimental Results and Interpretation

Implemented in PyTorch on an NVIDIA H200 GPU ($K=3$ stages, $T=3$, $C_p=64$), LatentUnfold achieves the following on standard BIR benchmarks:

  • Blind denoising: SIDD (PSNR 40.02 dB, SSIM 0.961), DND (40.06 dB, 0.958)
  • Blind deblurring: GoPro (34.32 dB, 0.970), HIDE (31.85 dB, 0.948)
  • Underwater: UIEB (24.70 dB, 0.947)
  • Backlit: BAID (24.97 dB, 0.910)
  • Low-light: LOL-v2 real (23.58 dB, 0.886); synthetic (27.92 dB, 0.957)
  • Deraining: Five benchmarks, average PSNR ~39.5 dB, SSIM ~0.98

Across all cases, UnfoldLDM establishes new state-of-the-art results for blind restoration (He et al., 22 Nov 2025).

8. Significance of Latent Diffusion Priors in Blind Restoration

Standard DUNs exhibit a low-frequency bias due to the dominance of smooth components in gradient-driven updates, especially under severe or unknown degradations, leading to oversmoothing. The latent diffusion prior in DR-LDM is explicitly trained for degradation invariance, promoting generative recovery of natural high-frequency textures. Conditioning the diffusion prior on MGDA’s estimates prevents reintroduction of degraded patterns, while the bidirectional interplay—cleaner inputs aiding prior learning, and stronger priors enhancing restoration—enables sharp, artifact-free outputs for a wide variety of degradations.

In summary, LatentUnfold (UnfoldLDM) represents an advance in blind image restoration by combining interpretable model-based unfolding, neural degradation modeling, and strong generative priors, implemented in a modular, extensible framework (He et al., 22 Nov 2025).
