
DeVeiler: Glare & Aberration Restoration Network

Updated 28 November 2025
  • DeVeiler is a physically informed restoration network that removes veiling glare and corrects compound aberrations in compact optical imaging.
  • It employs a U-shaped encoder-decoder with transformer-based bottleneck and specialized modules to estimate latent transmission and glare maps.
  • The network leverages a reversibility constraint and cycle-consistency loss to achieve superior restoration quality and robust performance in complex environments.

DeVeiler is a physically grounded restoration network designed to remove veiling glare and jointly correct compound aberrations in images acquired through compact optical systems, such as single-lens and metalens designs. These systems are subject to significant degradation due to spatially-varying aberration blur and depth-independent veiling glare caused by stray-light scattering from non-ideal surfaces and coatings, particularly in complex environments. DeVeiler operates as the final stage in a three-stage pipeline for aberration and veiling glare removal, utilizing latent transmission and glare maps to inform an inverse modeling process constrained by a reversibility principle. This approach enables the restoration of latent clean images with superior physical fidelity and restoration quality compared to purely blind or cascaded correction methods (Qian et al., 21 Nov 2025).

1. Framework and Objective

DeVeiler is implemented as the third stage in an integrated restoration pipeline comprising VeilGen (Stage 1), DDN (Distilled Degradation Network, Stage 2), and DeVeiler (Stage 3):

  • Stage 1 (VeilGen): Synthesizes paired degraded and clean images $(I_{ne}, I_c)$ and predicts two latent maps: a spatially varying transmission map $T$ and an additive glare map $G$. VeilGen utilizes Stable Diffusion priors for unsupervised learning and simulation of realistic optical degradation.
  • Stage 2 (DDN): Distills the generative degradation process into a lightweight forward model that can reapply the learned aberration and veiling glare effects.
  • Stage 3 (DeVeiler): Given only a degraded image $I_{ne}$, DeVeiler predicts the underlying clean image $\hat{I}_c$ by learning an inverse mapping guided by the estimated latent maps $(\hat{T}, \hat{G})$ and constrained by the physical reversibility embodied in the DDN model. The principal objective is formalized as:

$$\hat{I}_c = \mathrm{DeVeiler}(I_{ne}) \simeq I_c$$

where $I_c$ is the latent clean image, absent optical aberration and veiling glare.

2. Network Architecture

DeVeiler uses a U-shaped encoder–decoder topology with skip connections and a transformer-based bottleneck for multi-scale feature processing and high-capacity modeling:

  • Encoder: Three groups of ResBlocks extract features at multiple scales from the input $I_{ne}$.
  • Bottleneck: Stacked Residual Swin Transformer Blocks (RSTBs) provide high-capacity nonlinear mixing of latent representations.
  • Decoder: Three ResBlock groups reconstruct the clean image $\hat{I}_c$ from encoded features.
  • Veiling Glare Encoder (VG-Enc): A compact CNN module processes $I_{ne}$ to estimate the latent transmission and glare maps $(\hat{T}, \hat{G})$.
  • Veiling Glare Compensation Module (VGCM): Feature-wise modulation layers attached after early encoding blocks use the latent maps to compensate for glare in the feature space. For an intermediate feature $F$, the compensation is:

$$F' = F \odot f_1(\hat{T}) + f_2(\hat{G})$$

where $f_1$ and $f_2$ are $1 \times 1$ convolutions and $\odot$ denotes element-wise multiplication.

  • Skip Connections & Feature Fusion: Encoder features are passed directly to the decoder via concatenation or addition, maintaining fine spatial detail.
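The VGCM modulation above can be illustrated with a small NumPy toy. This is a sketch, not the paper's implementation: the helper names (`conv1x1`, `vgcm`) are hypothetical, and random weights stand in for the learned $1 \times 1$ convolutions $f_1$ and $f_2$:

```python
import numpy as np

def conv1x1(x, w):
    # x: (C_in, H, W), w: (C_out, C_in) -- a 1x1 convolution is just a
    # per-pixel linear map across channels
    return np.einsum('oc,chw->ohw', w, x)

def vgcm(feat, t_map, g_map, w_t, w_g):
    # Feature-wise compensation: F' = F ⊙ f1(T̂) + f2(Ĝ)
    scale = conv1x1(t_map, w_t)   # multiplicative term from transmission map
    shift = conv1x1(g_map, w_g)   # additive term from glare map
    return feat * scale + shift

rng = np.random.default_rng(0)
C, H, W = 8, 16, 16
feat  = rng.standard_normal((C, H, W))   # intermediate encoder feature F
t_map = rng.standard_normal((1, H, W))   # latent transmission map T̂ (1 channel)
g_map = rng.standard_normal((1, H, W))   # latent glare map Ĝ (1 channel)
w_t = rng.standard_normal((C, 1))        # weights of f1 (1x1 conv)
w_g = rng.standard_normal((C, 1))        # weights of f2 (1x1 conv)

out = vgcm(feat, t_map, g_map, w_t, w_g)
print(out.shape)  # (8, 16, 16)
```

With a unit scale and zero shift the module reduces to the identity, which is why feature-wise modulation is a safe insertion point in a pretrained encoder.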

3. Forward Model and Mathematical Formulation

The underlying degradation model for each image patch $p$ (omitting color channels) is expressed as:

$$I_{ne}^p = (I_c^p \otimes K^p) \cdot T^p + G^p$$

where:

  • $K^p$ is the local point spread function (PSF) modeling spatially varying blur,
  • $T^p$ is the local attenuation (transmission) map,
  • $G^p$ is the additive glare map.

In vector notation, the compound model is:

$$I_{ne} = T \odot (I_c \otimes K) + G$$

DeVeiler learns the inverse mapping:

$$\hat{I}_c = f_{\theta}(I_{ne})$$

ensuring that, given the estimated maps $(\hat{T}, \hat{G})$ and the DDN model,

$$\mathrm{DDN}(\hat{I}_c, \hat{c}_{vg}) \simeq I_{ne}$$

where $\hat{c}_{vg}$ denotes $(\hat{T}, \hat{G})$.
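The forward degradation model $I_{ne} = T \odot (I_c \otimes K) + G$ can be sketched in NumPy. For simplicity this toy uses a single global box-blur PSF rather than the spatially varying per-patch kernels of the paper, and the helper names are hypothetical:

```python
import numpy as np

def conv2d_same(img, kernel):
    # naive zero-padded 'same' 2-D convolution; adequate for a symmetric PSF
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def degrade(i_c, psf, t_map, g_map):
    # I_ne = T ⊙ (I_c ⊗ K) + G : blur by the PSF, attenuate, add glare
    return t_map * conv2d_same(i_c, psf) + g_map

# toy example: 3x3 box-blur PSF, uniform transmission and glare
rng = np.random.default_rng(0)
i_c = rng.random((32, 32))          # latent clean image
psf = np.full((3, 3), 1 / 9)        # symmetric box blur as a stand-in PSF
i_ne = degrade(i_c, psf, t_map=0.8, g_map=0.05)
```

DDN distills exactly this kind of forward process into a learned network, which is what makes the reversibility loss in the next section computable.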

4. Reversibility Constraint and Loss Functions

A key aspect of DeVeiler is its reversibility constraint, enforcing cycle-consistency between restoration and degradation via DDN. The reversibility loss is:

$$\mathcal{L}_{rev} = \|\mathrm{DDN}(\hat{I}_c, \hat{c}_{vg}) - I_{ne}\|_1$$

Training optimizes a combined objective:

$$\mathcal{L}_{total} = \mathcal{L}_{rec} + \lambda_{rev}\,\mathcal{L}_{rev}$$

where:

  • Reconstruction loss:

$$\mathcal{L}_{rec} = \|\hat{I}_c - I_c\|_1 + \lambda_{perc}\,\mathcal{L}_{perc}$$

with

$$\mathcal{L}_{perc} = \sum_\ell \|\phi_\ell(\hat{I}_c) - \phi_\ell(I_c)\|_2^2$$

where $\phi_\ell$ denotes VGG feature maps at selected layers.

  • Loss weights: $\lambda_{rev}=1.0$, $\lambda_{perc}=0.01$.

No adversarial loss is employed in Stage 3. The use of reversibility drives the network to produce physically meaningful latent map estimates and robust restoration.
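The combined objective can be summarized in a minimal sketch. All names here are illustrative: `ddn` stands for the frozen distilled degradation network, and `feature_fns` are simple callables standing in for the VGG layers $\phi_\ell$:

```python
import numpy as np

def l1(a, b):
    # mean absolute error, used for both L_rec's first term and L_rev
    return np.abs(a - b).mean()

def perceptual(a, b, feature_fns):
    # L_perc = Σ_ℓ ||φ_ℓ(a) - φ_ℓ(b)||²; feature_fns stand in for VGG layers
    return sum(np.mean((f(a) - f(b)) ** 2) for f in feature_fns)

def total_loss(i_hat, i_c, i_ne, ddn, c_vg, feature_fns,
               lam_perc=0.01, lam_rev=1.0):
    # L_rec = ||Î_c - I_c||_1 + λ_perc * L_perc
    rec = l1(i_hat, i_c) + lam_perc * perceptual(i_hat, i_c, feature_fns)
    # L_rev = ||DDN(Î_c, ĉ_vg) - I_ne||_1 (cycle-consistency via frozen DDN)
    rev = l1(ddn(i_hat, c_vg), i_ne)
    return rec + lam_rev * rev
```

Note that gradients from $\mathcal{L}_{rev}$ flow through the frozen DDN into DeVeiler's prediction, which is how the reversibility constraint shapes the restoration.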

5. Training Protocol

The training procedure consists of data synthesis, forward model distillation, and staged network optimization:

  • Data Generation (VeilGen + DDN):
    • VeilGen is trained on paired aberration-only and unpaired compound data using diffusion losses: $p=0.3$, $w=0.85$, 9k steps, AdamW with learning rate $1\times10^{-5}$, batch size 16.
    • Using the frozen LOTGMP and 10-step sampling, 500 $(I_c, I_{ne})$ pairs are synthesized from the Flickr2K dataset.
    • DDN distillation: 25k iterations, Adam ($2\times10^{-4} \rightarrow 1\times10^{-7}$ cosine schedule), batch size 8, patch size $256\times256$, loss: $\|\mathrm{DDN}(I_c, c_{vg}) - \mathrm{VeilGen}(I_c, c_{vg})\|_1$.
  • DeVeiler Optimization:
    • Phase 1 (pre-train): 170/125 source pairs, 100k iterations, Adam ($2\times10^{-4} \rightarrow 1\times10^{-7}$), batch size 8, patch size 256, random flips.
    • Phase 2 (fine-tune): hybrid of the 500 synthetic pairs and the source pairs, 5k iterations, Adam ($5\times10^{-5} \rightarrow 1\times10^{-7}$), same batch size and augmentation.
    • VG-Enc and VGCM components are inserted between encoder and bottleneck.
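The cosine learning-rate schedules quoted above (e.g. $2\times10^{-4} \rightarrow 1\times10^{-7}$) follow the standard cosine-annealing formula; this is a generic sketch using the paper's endpoint values, not code from the paper:

```python
import math

def cosine_lr(step, total_steps, lr_max=2e-4, lr_min=1e-7):
    # cosine decay from lr_max to lr_min over total_steps
    # (endpoints taken from the training protocol in Section 5)
    t = min(step, total_steps) / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * t))

print(cosine_lr(0, 100_000))        # lr_max at the start of pre-training
print(cosine_lr(100_000, 100_000))  # decays to lr_min by the final iteration
```

The fine-tuning phase would use the same formula with `lr_max=5e-5` and `total_steps=5_000`.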

6. Quantitative and Qualitative Evaluation

DeVeiler demonstrates superior performance across diverse evaluation protocols:

a) Screen-Compound (Full-Reference)

| Method | SL: PSNR↑ / SSIM↑ / LPIPS↓ | MRL: PSNR↑ / SSIM↑ / LPIPS↓ |
|---|---|---|
| SwinIR + Flare7K++ | 21.67 / 0.723 / 0.297 | 20.74 / 0.745 / 0.336 |
| QDMR | 18.45 / 0.681 / 0.291 | 20.67 / 0.725 / 0.315 |
| DeVeiler (Ours) | 22.38 / 0.729 / 0.261 | 21.57 / 0.746 / 0.301 |

b) Realworld-Compound (No-Reference)

| Method | SL: CLIPIQA↑ / Q-Align↑ / NIQE↓ | MRL: CLIPIQA↑ / Q-Align↑ / NIQE↓ |
|---|---|---|
| DiffDehaze | 0.406 / 3.982 / 6.476 | 0.428 / 3.476 / 6.802 |
| QDMR | 0.405 / 3.864 / 4.773 | 0.376 / 3.337 / 5.509 |
| DeVeiler (Ours) | 0.607 / 3.987 / 4.448 | 0.440 / 3.586 / 5.296 |

Qualitatively, DeVeiler recovers enhanced contrast and color fidelity in severe glare regions, and preserves fine textures that are frequently lost or incorrectly restored by cascaded or blind correction schemes.

7. Implementation Details and Significance

  • Architecture: Encoder and decoder each consist of 3 ResBlock groups; bottleneck utilizes 6 RSTB layers.
  • Hyperparameters: $\lambda_{rev}=1.0$, $\lambda_{perc}=0.01$, Adam optimizer ($\beta_1=0.9$, $\beta_2=0.999$).
  • Training time: VeilGen data synthesis ≈ 3 hours; DeVeiler fine-tune ≈ 40 minutes (NVIDIA A100).
  • Inference latency: ≈ 0.39 seconds per $1280\times1920$ image (A100 GPU).

The combination of encoder–decoder architecture, explicit glare modeling via VG-Enc and VGCM, and reversibility-driven learning yields a physically interpretable, data-efficient solution for restoration in demanding optical scenarios. This design outperforms cascaded and blind correction schemes for imaging through compact optics (Qian et al., 21 Nov 2025).

A plausible implication is that physically informed restoration networks, constrained by reversible scattering models and guided by latent transmission and glare maps, offer substantial advantages in fidelity and robustness for real-world imaging applications where compound optical degradations are present.
