
LatentUnfold: Unified Blind Image Restoration

Updated 22 December 2025
  • LatentUnfold is a unified framework for blind image restoration that integrates interpretable optimization, nonparametric degradation modeling, and conditional latent diffusion priors.
  • It employs a multi-stage architecture featuring a Multi-Granularity Degradation-Aware module, a Degradation-Resistant Latent Diffusion Model, and an Over-Smoothing Correction Transformer.
  • Experimental results across benchmarks such as SIDD, GoPro, and LOL-v2 demonstrate state-of-the-art performance in PSNR and SSIM, highlighting its robustness and efficacy.

LatentUnfold (formally UnfoldLDM) is a unified deep unfolding network for blind image restoration (BIR), integrating interpretable optimization principles, nonparametric degradation modeling, and conditional latent diffusion priors within a multi-stage architecture. Developed to address the dual limitations of degradation-specific dependency and over-smoothing bias inherent in classical Deep Unfolding Networks (DUNs), LatentUnfold introduces a three-part pipeline: Multi-Granularity Degradation-Aware modeling, Degradation-Resistant Latent Diffusion Priors, and Over-Smoothing Correction Transformers. This plug-and-play structure achieves state-of-the-art results across a wide spectrum of BIR tasks by jointly estimating the unknown degradation and restoring structured, high-frequency image details (He et al., 22 Nov 2025).

1. Blind Restoration Optimization Formulation

The blind restoration model observes a degraded image $y \in \mathbb{R}^{c \times h \times w}$ generated via an unknown linear process:

$$y = Dx + n$$

where $x$ is the latent clean image, $D$ is an unknown degradation matrix, and $n$ is additive noise. To capture structure and reduce complexity, $D$ is factorized as a Kronecker product:

$$D = M^T \otimes W$$

with $W \in \mathbb{R}^{c \times h \times h}$ and $M \in \mathbb{R}^{c \times w \times w}$. The energy minimization for restoration is

$$L(x) = \frac{1}{2}\|y - Dx\|_2^2 + \frac{1}{2}\|y - WxM\|_2^2 + \lambda\,\phi(x)$$

where $\phi(x)$ is a learned image prior and $\lambda$ weights the regularization. The problem is solved via $K$-stage proximal-gradient unfolding, applying block-coordinate descent over the fidelity terms $g(x)$ and $h(x)$, followed by a learned proximal operator. Specifically, at each stage $k$:

  • $\hat{x}_k = x_{k-1} - \beta_k \nabla g(x_{k-1})$
  • $\tilde{x}_k = x_{k-1} - \gamma_k \nabla h(x_{k-1})$
  • $x_k = \mathrm{prox}_{\lambda,\phi}(\hat{x}_k, \tilde{x}_k)$
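A minimal numerical sketch of one unfolding stage, using NumPy with a simple averaging function standing in for the learned proximal operator (the `prox` argument and the step sizes are illustrative, not from the paper):

```python
import numpy as np

def unfold_stage(x_prev, y, D, W, M, beta, gamma, prox):
    """One stage of K-stage proximal-gradient unfolding (illustrative).

    g(x) = 0.5 * ||y - D x||^2   with x column-major vectorized
    h(x) = 0.5 * ||y - W x M||^2 with x kept as a matrix
    """
    # Holistic gradient step: grad g(x) = D^T (D vec(x) - vec(y))
    r = D @ x_prev.flatten(order="F") - y.flatten(order="F")
    x_hat = x_prev - beta * (D.T @ r).reshape(x_prev.shape, order="F")
    # Structured gradient step: grad h(x) = W^T (W x M - y) M^T
    x_tilde = x_prev - gamma * (W.T @ (W @ x_prev @ M - y) @ M.T)
    # Learned proximal operator fuses both estimates
    return prox(x_hat, x_tilde)

rng = np.random.default_rng(0)
h, w = 4, 3
W = rng.standard_normal((h, h))
M = rng.standard_normal((w, w))
D = np.kron(M.T, W)            # Kronecker factorization D = M^T ⊗ W
x0 = rng.standard_normal((h, w))
y = W @ x0 @ M                 # noiseless observation, so x0 is a fixed point
x1 = unfold_stage(x0, y, D, W, M, 0.1, 0.1, lambda a, b: 0.5 * (a + b))
```

With a noiseless observation generated from `x0` itself, both gradients vanish and the stage returns `x0` unchanged; the identity $\mathrm{vec}(WxM) = (M^T \otimes W)\,\mathrm{vec}(x)$ (column-major vec) makes the two fidelity terms agree.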

2. Multi-Granularity Degradation-Aware (MGDA) Module

MGDA replaces analytic gradients with data-driven surrogates, enabling end-to-end handling of unknown degradations:

  • Holistic Degradation: Two Siamese Visual State Space (VSS) networks ($\mathrm{VSS}^D_k$, $\mathrm{VSS}^{D^T}_k$) estimate $D$ and its transpose, producing

$$\hat{x}_k = x_{k-1} - \beta_k\, \mathrm{VSS}^{D^T}_k\big(\mathrm{VSS}^D_k(x_{k-1}) - y\big)$$

  • Structured Decomposition: Neural blocks $\mathcal{D}_M$ and $\mathcal{D}_W$ alternately estimate the components $M_k$ and $W_k$, constructed via normalized outputs from concatenated feature maps. The structured fidelity update is

$$\tilde{x}_k = x_{k-1} - \gamma_k\, W_k^T (W_k x_{k-1} M_k - y) M_k^T$$

An intra-stage consistency loss

$$L_\text{ISDA} = \sum_{k=2}^{K} 2^{-(K-k)} \|\hat{x}_k - \tilde{x}_k\|_1$$

promotes alignment between holistic and structural branches.
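The exponential weighting in $L_\text{ISDA}$, which emphasizes disagreement at later stages, can be sketched directly (the list-of-arrays interface is illustrative):

```python
import numpy as np

def isda_loss(x_hats, x_tildes):
    """Intra-stage consistency: sum_{k=2}^{K} 2^{-(K-k)} * ||x_hat_k - x_tilde_k||_1.

    x_hats, x_tildes: lists of per-stage estimates; index 0 corresponds to stage 1.
    """
    K = len(x_hats)
    total = 0.0
    for k in range(2, K + 1):               # stage 1 carries no consistency term
        weight = 2.0 ** (-(K - k))          # later stages weighted more heavily
        total += weight * np.abs(x_hats[k - 1] - x_tildes[k - 1]).sum()
    return total
```

For $K = 3$ the weights are $1/2$ and $1$, so the final stage's holistic/structured disagreement dominates the penalty.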

3. Degradation-Resistant Latent Diffusion Model (DR-LDM)

The proximal operator in LatentUnfold is realized by a conditional latent diffusion model designed for degradation invariance:

  • Latent Prior Extraction: In Phase I, a Prior Inference (PI) network maps $[\hat{x}_k, \tilde{x}_k, x_{GT}]$ to a compact latent prior $P_k^h$.
  • Diffusion Forward: For $t = 1, \dots, T$ steps,

$$q(P_k^{h,t} \mid P_k^{h,t-1}) = \mathcal{N}\big(P_k^{h,t};\; \sqrt{1-\beta^t}\, P_k^{h,t-1},\; \beta^t I\big)$$

with $\alpha^t = 1 - \beta^t$ and $\bar{\alpha}^t = \prod_{i=1}^{t} \alpha^i$.

  • Diffusion Reverse: A denoising network $\epsilon_\theta$ predicts the noise given the noisy latent prior and a conditioning vector $P_k^c = \mathrm{PI}'([\hat{x}_k, \tilde{x}_k])$. The recursion is:

$$P_k^{h,t-1} = \frac{1}{\sqrt{\alpha^t}}\Big(P_k^{h,t} - \frac{1-\alpha^t}{\sqrt{1-\bar{\alpha}^t}}\, \epsilon_\theta(\cdot)\Big) + \sqrt{1-\alpha^t}\,\epsilon^t$$

After $T$ steps, the sampled prior $\hat{P}_k^h$ is passed to the detail recovery module.
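The forward schedule and reverse recursion follow the standard DDPM form; a small sketch, where the closed-form jump to step $t$ and the externally supplied noise prediction are illustrative stand-ins for the full sampler:

```python
import numpy as np

def cumulative_alphas(betas):
    """abar_t = prod_{i<=t} (1 - beta_i) for the forward noising schedule."""
    return np.cumprod(1.0 - betas)

def forward_sample(p0, abar_t, eps):
    """Closed-form q(P^t | P^0): sqrt(abar_t) * P^0 + sqrt(1 - abar_t) * eps."""
    return np.sqrt(abar_t) * p0 + np.sqrt(1.0 - abar_t) * eps

def reverse_step(p_t, eps_pred, alpha_t, abar_t, noise):
    """One reverse update, matching the recursion for P_k^{h,t-1} in the text."""
    mean = (p_t - (1.0 - alpha_t) / np.sqrt(1.0 - abar_t) * eps_pred) / np.sqrt(alpha_t)
    return mean + np.sqrt(1.0 - alpha_t) * noise
```

With a short schedule such as $T = 3$ (the configuration reported for LatentUnfold), the cumulative products are cheap to compute; in a real implementation the conditioning vector $P_k^c$ would enter through the network producing `eps_pred`.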

4. Over-Smoothing Correction Transformer (OCFormer)

OCFormer is a U-shaped network that fuses intermediate results and the diffusion posterior to restore high-frequency textures:

  • Degradation-Resistant Attention (DRA): Features from $[\hat{x}_k, \tilde{x}_k]$ are enriched by learning self-attention weights through mixed $1 \times 1$ and $3 \times 3$ depthwise convolutions:

$$F' = \mathrm{Softmax}\big((QK^T)/I\big)\, V + F$$

  • Prior-Guided Detail Recovery (PDR): The prior $\hat{P}_k^h$ modulates normalized features:

$$F'' = \mathrm{Linear}_1(\hat{P}_k^h) \odot \mathrm{LN}(F') + \mathrm{Linear}_2(\hat{P}_k^h)$$

$$F_x = F' + \mathrm{GELU}(W_G F'') \odot (W_H F'')$$

The final output $x_k$ is generated by the U-Net decoder.
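The PDR modulation can be sketched with plain arrays; the weight matrices, precomputed scale/shift vectors (the outputs of $\mathrm{Linear}_1$/$\mathrm{Linear}_2$ on the prior), and the tanh-approximated GELU are illustrative stand-ins for the learned layers:

```python
import numpy as np

def gelu(z):
    """Tanh approximation of GELU."""
    return 0.5 * z * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (z + 0.044715 * z**3)))

def pdr(F_prime, scale, shift, W_G, W_H, eps=1e-5):
    """Prior-Guided Detail Recovery (illustrative)."""
    # LayerNorm over the channel (last) axis
    mu = F_prime.mean(axis=-1, keepdims=True)
    var = F_prime.var(axis=-1, keepdims=True)
    ln = (F_prime - mu) / np.sqrt(var + eps)
    # F'' = Linear1(P_hat) ⊙ LN(F') + Linear2(P_hat)
    F2 = scale * ln + shift
    # F_x = F' + GELU(W_G F'') ⊙ (W_H F'')
    return F_prime + gelu(F2 @ W_G) * (F2 @ W_H)
```

The residual structure is easy to verify: if `W_G` is all zeros, the gate closes and the module reduces to the identity on `F_prime`.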

5. Unified Iterative Restoration Algorithm

The end-to-end unfolding procedure, as summarized in the provided pseudocode, executes KK stages. Each stage alternates between MGDA steps to estimate both holistic and structured degradations, then applies DR-LDM to sample a latent prior, and finally invokes OCFormer for refined reconstruction. This approach is designed as plug-and-play; it can be integrated as a wrapper for existing DUN-based methods.
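The outer loop reduces to a thin driver over the three modules; the callables below are stand-ins for the learned networks, not the paper's API:

```python
def latent_unfold(y, x_init, K, mgda, dr_ldm, ocformer):
    """Illustrative K-stage driver: MGDA fidelity steps -> DR-LDM prior
    sampling -> OCFormer proximal refinement, repeated per stage."""
    x = x_init
    for k in range(1, K + 1):
        x_hat, x_tilde = mgda(x, y, k)       # holistic + structured gradient steps
        prior = dr_ldm(x_hat, x_tilde)       # sample degradation-resistant latent prior
        x = ocformer(x_hat, x_tilde, prior)  # refine and recover high-frequency detail
    return x

# Stub usage: each "stage" simply increments the estimate
result = latent_unfold(
    y=None, x_init=0, K=3,
    mgda=lambda x, y, k: (x, x),
    dr_ldm=lambda a, b: None,
    ocformer=lambda a, b, p: a + 1,
)
```

Because each stage consumes only the previous estimate and the observation, any DUN-style method exposing these three hooks can be wrapped this way.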

6. Training Procedures and Loss Functions

The training is phased:

  • Phase I: Pretrain PI and OCFormer with

$$L^{I}_{\text{Total}} = L_{\text{Rec}} + \zeta_1 L_{\text{ISDA}}$$

where $L_{\text{Rec}} = \|x_K - x_{GT}\|_1$.

  • Phase II: Train DR-LDM and fine-tune the entire framework with

$$L^{II}_{\text{Total}} = L_{\text{Rec}} + \zeta_2 L_{\text{ISDA}} + \zeta_3 L_{\text{Diff}}$$

with $L_{\text{Diff}} = \|\hat{P}_k^h - P_k^h\|_1$ and $\zeta_1 = \zeta_2 = \zeta_3 = 1$ in practice.
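Both phase objectives are weighted sums of L1 terms; a compact sketch (the scalar `L_isda` input and function names are illustrative):

```python
import numpy as np

def l1(a, b):
    """||a - b||_1, as used by both L_Rec and L_Diff."""
    return np.abs(np.asarray(a) - np.asarray(b)).sum()

def phase1_loss(x_K, x_gt, L_isda, zeta1=1.0):
    """Phase I: L_Rec + zeta1 * L_ISDA (pretrains PI and OCFormer)."""
    return l1(x_K, x_gt) + zeta1 * L_isda

def phase2_loss(x_K, x_gt, L_isda, p_hat, p_gt, zeta2=1.0, zeta3=1.0):
    """Phase II: adds the diffusion prior term L_Diff = ||P_hat - P||_1."""
    return l1(x_K, x_gt) + zeta2 * L_isda + zeta3 * l1(p_hat, p_gt)
```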

7. Experimental Results and Interpretation

Implemented in PyTorch on an NVIDIA H200 GPU ($K=3$ stages, $T=3$, $C_p=64$), LatentUnfold achieves the following on standard BIR benchmarks:

  • Blind denoising: SIDD (PSNR 40.02 dB, SSIM 0.961), DND (40.06 dB, 0.958)
  • Blind deblurring: GoPro (34.32 dB, 0.970), HIDE (31.85 dB, 0.948)
  • Underwater: UIEB (24.70 dB, 0.947)
  • Backlit: BAID (24.97 dB, 0.910)
  • Low-light: LOL-v2 real (23.58 dB, 0.886); synthetic (27.92 dB, 0.957)
  • Deraining: Five benchmarks, average PSNR ~39.5 dB, SSIM ~0.98

Across all cases, UnfoldLDM establishes new state-of-the-art results for blind restoration (He et al., 22 Nov 2025).

8. Significance of Latent Diffusion Priors in Blind Restoration

Standard DUNs exhibit a low-frequency bias due to the dominance of smooth components in gradient-driven updates, especially under severe or unknown degradations, leading to oversmoothing. The latent diffusion prior in DR-LDM is explicitly trained for degradation invariance, promoting generative recovery of natural high-frequency textures. Conditioning the diffusion prior on MGDA’s estimates prevents reintroduction of degraded patterns, while the bidirectional interplay—cleaner inputs aiding prior learning, and stronger priors enhancing restoration—enables sharp, artifact-free outputs for a wide variety of degradations.

In summary, LatentUnfold (UnfoldLDM) represents an advance in blind image restoration by combining interpretable model-based unfolding, neural degradation modeling, and strong generative priors, implemented in a modular, extensible framework (He et al., 22 Nov 2025).
