Lightweight Deblurring Block (LD)
- Lightweight Deblurring Blocks (LD) are parameter-efficient modules that utilize bottleneck convolutions, residual paths, and multi-scale feature extraction to restore sharp image details.
- They incorporate explicit mathematical constraints and multi-branch designs to achieve kernel-level inversion and robust deblurring performance with minimal computational cost.
- Integration of LD blocks into frameworks enables real-time, low-latency image restoration applications in autonomous vehicles, mobile vision, and embedded systems.
A Lightweight Deblurring Block (LD) is a compact, parameter-efficient module designed to perform high-quality image deblurring with markedly lower computational cost and memory footprint than conventional deep deblurring networks. LD blocks have become central components in recent accelerator-oriented deblurring architectures, as well as in networks pursuing theoretical tractability, interpretability, and/or kernel-level inversion. LD modules typically combine bottleneck convolutional designs, residual or direct inverse-mapping paths, multi-scale feature extraction, and optionally domain-specific constraints to address the broad spectrum of blur phenomena encountered in natural and dynamic scenes.
1. Core Architectural Principles and Variants
LD blocks are instantiated with varied architectural choices to target distinct speed–accuracy–generalization trade-offs. Major LD types described in leading works include:
- Deep Linear Network Inverse Block (DRK/L-CNN): The D³ framework (Saraswathula et al., 5 Jul 2024) formulates the LD as a purely linear convolutional cascade that learns explicit inverse kernels for arbitrary anisotropic Gaussian blur, realized as a Deep Restoration Kernel (DRK), an 11×11 filter whose coefficients are learned to satisfy a strict identity constraint against a bank of random blur kernels. The associated L-CNN uses five layers of 3×3, stride-1 linear convolutions (32 feature maps, ≈0.0028M parameters).
- Dilated-Inception Bottleneck Block (RFB-s): The SharpGAN generator (Feng et al., 2020) employs the LD as "mini" Receptive Field Block (RFB-s) modules, each composed of five parallel branches (a shortcut plus four dilated/inception pathways) that aggregate multi-scale, directionally sensitive features without resolution loss. Each block uses 1×1 bottleneck reductions, dilated convolutions at multiple dilation rates, and channel concatenation coupled with skip connections.
- Depthwise-Pointwise with Edge-Normalization Block: RT-Focuser (Wu et al., 26 Dec 2025) introduces the LD as a sequence of depthwise 3×3 convolution, pointwise expansion–compression (C→4C→C), batch normalization with GELU activation, and residual-to-input summation, optionally enhanced by an edge-preserving sharpness normalization (SN) that applies a fixed Laplacian filter to the input.
- Compact U-Net Encoder–Decoder Block: The Deep Idempotent Network (Mao et al., 2022) leverages a lightweight encoder–decoder LD unit (U-Net style) with stacked residual blocks (10 in total) and two resolution-level skip-concats. All convolutions use stride-1 or stride-2 for up/downsampling; no normalization layers are present.
A distinguishing property of LDs is that they perform powerful feature extraction or direct inverse filtering with orders of magnitude fewer parameters than conventional deep modules (in the most compact designs, under 0.003M per block).
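The parameter economy of the depthwise-pointwise variant can be sanity-checked with simple arithmetic. The sketch below assumes only a bias-carrying depthwise 3×3 plus two pointwise layers (C→4C→C); the reported RT-Focuser figure of ≈8,544 for C=32 presumably includes small additional terms (e.g., normalization parameters) not modeled here.

```python
# Back-of-envelope parameter count for a depthwise-pointwise LD block.
# The exact layer list is an assumption of this sketch, not the paper's.
def ld_param_count(C, k=3, expand=4, bias=True):
    dw = C * k * k + (C if bias else 0)                     # depthwise 3x3
    pw_up = C * (expand * C) + (expand * C if bias else 0)  # 1x1 expansion
    pw_down = (expand * C) * C + (C if bias else 0)         # 1x1 compression
    return dw + pw_up + pw_down

print(ld_param_count(32))  # a few thousand parameters for C = 32
```

For comparison, a standard 3×3 convolution at C=32 alone costs 32·32·9 ≈ 9.2k weights, so the separable design spends most of its budget on the 1×1 layers.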
2. Mathematical Formulation and Constraints
Typical LD design includes explicit mathematical constraints, nonlinearities, and residual logic:
- Inverse Kernel Learning via Identity Constraint: In D³, the LD minimizes
$$\min_{k}\; \sum_{i} \big\| k * k_{b_i} - \delta \big\|_2^2,$$
where $k$ is the learned inverse (DRK) kernel, $\{k_{b_i}\}$ is a bank of random blur kernels, and $\delta$ is the discrete delta, subject to Fourier-domain regularizers (area constraint, zero phase, unit magnitude) that promote physical validity and robust inversion (Saraswathula et al., 5 Jul 2024).
- Multi-Branch Aggregation: In SharpGAN RFB-s,
$$y = x + \mathrm{Conv}_{1\times 1}\big(\mathrm{concat}(f_1, f_2, f_3, f_4)\big),$$
where the $f_i$ are the outputs of the dilated-conv branches after bottleneck projection and $x$ is the shortcut input (Feng et al., 2020).
- Depthwise-Pointwise Pipeline: In RT-Focuser,
$$y = x + \hat{x} + \mathrm{SN}(x),$$
with $\hat{x}$ the compressed features produced by the depthwise 3×3 and pointwise C→4C→C pipeline and $\mathrm{SN}(x)$ the Laplacian edge-enhancement of the input (Wu et al., 26 Dec 2025).
- Progressive Residual U-Net: In the Deep Idempotent Network,
$$x_{t+1} = F(x_t),$$
with repeated application of the weight-shared LD unit $F$ leading to enforced idempotence, $F(F(b)) = F(b)$, via a training loss that penalizes $\| F(F(b)) - F(b) \|$ (Mao et al., 2022).
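The identity constraint of the D³ formulation can be illustrated with a small least-squares fit: find a single inverse kernel whose convolution with every blur kernel in a bank approximates a delta. This is a minimal sketch, assuming mild Gaussian blurs and omitting the paper's Fourier-domain regularizers; kernel sizes and sigmas here are illustrative.

```python
import numpy as np

def conv_full(a, b):
    # Full 2-D convolution by overlap-add of shifted copies of b.
    H, W = a.shape[0] + b.shape[0] - 1, a.shape[1] + b.shape[1] - 1
    out = np.zeros((H, W))
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            out[i:i + b.shape[0], j:j + b.shape[1]] += a[i, j] * b
    return out

def gaussian(m, sigma):
    ax = np.arange(m) - m // 2
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    return g / g.sum()

def learn_inverse_kernel(blurs, s=11):
    # Least-squares fit of one s x s kernel k with k * b ~= delta for every
    # b in the bank -- the identity constraint, without regularizers.
    A_rows, t_rows = [], []
    for b in blurs:
        n = s + b.shape[0] - 1
        A = np.zeros((n * n, s * s))
        for idx in range(s * s):
            e = np.zeros((s, s))
            e.flat[idx] = 1.0
            A[:, idx] = conv_full(e, b).ravel()  # linear map: k -> k * b
        delta = np.zeros((n, n))
        delta[n // 2, n // 2] = 1.0
        A_rows.append(A)
        t_rows.append(delta.ravel())
    k, *_ = np.linalg.lstsq(np.vstack(A_rows),
                            np.concatenate(t_rows), rcond=None)
    return k.reshape(s, s)

blurs = [gaussian(5, sig) for sig in (0.5, 0.6)]  # illustrative bank
k = learn_inverse_kernel(blurs)
res = conv_full(k, blurs[0])  # should approximate a centered delta
```

Once fitted, deblurring is a single convolution with `k`, which is what makes the DRK route so cheap at inference time.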
3. Integration into Deblurring Frameworks
LD blocks are typically deployed in three types of global frameworks:
| System | LD Role | Integration Mode |
|---|---|---|
| D³ (DRK/L-CNN) | Learn/encode the inverse kernel; direct conv | Standalone kernel convolution or linear CNN |
| SharpGAN | Multi-scale feature extraction | 9× stacked blocks in GAN generator |
| RT-Focuser | Edge-aware encoding | Encoder stem; skip to MLIA/X-Fuse |
| Deep Idempotent | Progressive residual correction | Recurrent residual stack |
In D³, the LD is both the network and the explicit operator; in SharpGAN, LDs form the main body of the generator. RT-Focuser uses multiple LD stages per encoder resolution for edge enhancement and parameter efficiency. The Deep Idempotent Network iterates the LD block over the image with weights shared across steps, enforcing stable output through idempotence.
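The weight-shared recurrence and the idempotence constraint fit together naturally: once the mapping is idempotent, extra recurrent steps are no-ops. The toy sketch below uses a clipping operator as a stand-in for a trained LD mapping, purely to demonstrate the mechanics.

```python
import numpy as np

def recurrent_restore(F, blurred, steps=3):
    # Apply the same (weight-shared) LD mapping repeatedly, as in the
    # Deep Idempotent Network; the step count here is illustrative.
    x = blurred
    for _ in range(steps):
        x = F(x)
    return x

def idempotence_gap(F, b):
    # Deviation from F(F(b)) == F(b); driving this to zero stabilizes
    # the recurrence against over-correction.
    y = F(b)
    return float(np.abs(F(y) - y).max())

# Clipping to [0, 1] is exactly idempotent, so its gap is zero and
# additional recurrent steps leave the output unchanged.
clip01 = lambda x: np.clip(x, 0.0, 1.0)
```

A trained restoration network is only approximately idempotent; the gap above is what the idempotent loss term drives toward zero during training.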
4. Computational Complexity, Parameter Counts, and Performance
LD blocks achieve substantial reductions in both parameter count and runtime relative to non-lightweight architectures.
- D³: DRK: 121 parameters (11×11 filter), 0.0005 s inference (1 MP image); L-CNN: ≈2,800 weights, 0.002 s. Traditional baselines: 6–20M parameters, 1–3s runtime (Saraswathula et al., 5 Jul 2024).
- SharpGAN: Each LD block ≈316,800 parameters (0.3M), nine blocks ≈2.7×10¹⁰ FLOPs for 64×64 features; end-to-end 0.17 s for 1280×720 images (Feng et al., 2020).
- RT-Focuser: Each LD block ≈8,544 parameters for C=32; total network 5.85M params, 15.76 GMACs, 6 ms/frame, >140 FPS on GPU/mobile (Wu et al., 26 Dec 2025).
- Deep Idempotent Network: LD block and recurrence stack ≈3.11M params, 0.028 s/image (1280×720), 36 FPS; 6.5× smaller and 6.4× faster than MPRNet (Mao et al., 2022).
A plausible implication is that LD block designs enable real-time/edge AI deployment of deblurring networks formerly restricted to offline or server contexts.
5. Effectiveness and Ablations
Empirical gains and ablation studies confirm the critical role of LD components:
- D³ DRK: Best PSNR/SSIM among blind/self-supervised deblurring models on synthetic Gaussian-blurred DIV2K (DRK: 28.02 dB / 0.8383) (Saraswathula et al., 5 Jul 2024).
- RT-Focuser: Turning off the SN edge branch in LD decreases PSNR by ≈0.2 dB; halving LD block count in encoder reduces PSNR by ≈0.3 dB, with ~40% reduction in parameters (Wu et al., 26 Dec 2025).
- Idempotent Network: Imposing the idempotent constraint gives +0.12 dB PSNR, raising SSIM from 0.949 to 0.953; feature-map and latent-code recurrence via LD adds further gains (Mao et al., 2022).
- SharpGAN: The LD-based generator gives roughly twice the throughput of DeblurGAN-v2 while maintaining state-of-the-art deblurring metrics on dynamic motion-blur datasets (Feng et al., 2020).
This suggests that the LD block is not only a parameter-reduction device but also structurally important for extracting and restoring sharp details, especially under blur diversity and resource constraints.
6. Design Trends and Implementation Guidance
Recent LD designs reflect key trends:
- Emphasis on edge-aware preprocessing: depthwise/pointwise separations and fixed high-pass filters (Laplacian) for explicit edge sharpening.
- Multi-branch, multi-dilation architectures to simulate wide receptive fields without pooling/downsampling.
- Explicit linearity and invertibility in kernel-driven designs for interpretable behavior.
- Progressive and idempotent supervision to stabilize multi-step restoration and prevent over-correction.
- Absence of batch normalization in some lightweight, convolutional LD blocks to minimize inference cost and avoid normalization shifts.
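The multi-dilation trend can be made concrete with a single-channel sketch: a dilated 3×3 tap pattern widens the receptive field to (2d+1)×(2d+1) without pooling, and parallel branches at different rates are merged onto a shortcut. Dilation rates and the scalar branch mix below are assumptions of this sketch (the real blocks use 1×1 bottlenecks and channel concatenation).

```python
import numpy as np

def dilated_conv3x3(x, w, d):
    # 'Same'-padded 3x3 cross-correlation with dilation d on a 2-D map:
    # taps are spaced d pixels apart, so the receptive field is
    # (2d+1) x (2d+1) with only 9 weights and no downsampling.
    H, W = x.shape
    xp = np.pad(x, d)
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += w[i, j] * xp[i * d:i * d + H, j * d:j * d + W]
    return out

def multi_dilation_branch(x, weights, dilations, mix):
    # Shortcut plus a weighted sum of dilated branches -- a single-channel
    # stand-in for bottleneck-projected branches and concatenation.
    return x + sum(m * dilated_conv3x3(x, w, d)
                   for m, w, d in zip(mix, weights, dilations))
```

With a center-only kernel the dilated convolution reduces to the identity regardless of `d`, which is a convenient correctness check.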
To implement an LD block, canonical operations include depthwise spatial convolution (3×3), pointwise 1×1 expansion/compression, edge-normalization branches (e.g., Laplacian-based), residual summing, and optional skip concatenations. For kernel-based LDs, the output is a learned filter, directly convolved in a single operation for restoration.
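The canonical operations listed above can be sketched as one forward pass. This is a minimal numpy illustration under assumed shapes (C, H, W feature maps, tanh-approximated GELU, a fixed-Laplacian edge branch with an assumed scale), not a reference implementation of any of the cited blocks.

```python
import numpy as np

LAPLACIAN = np.array([[0., 1., 0.],
                      [1., -4., 1.],
                      [0., 1., 0.]])

def _conv3x3_same(x, w):
    # Per-channel 'same' 3x3 cross-correlation; x: (C, H, W), w: (C, 3, 3).
    C, H, W = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += w[:, i, j][:, None, None] * xp[:, i:i + H, j:j + W]
    return out

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1.0 + np.tanh(0.7978845608 * (x + 0.044715 * x ** 3)))

def ld_block(x, w_dw, w_up, w_down, edge_scale=0.1):
    """One LD forward pass (sketch; shapes and edge_scale are assumptions).
    x: (C, H, W); w_dw: (C, 3, 3) depthwise; w_up: (4C, C); w_down: (C, 4C)."""
    dw = _conv3x3_same(x, w_dw)                        # depthwise spatial conv
    up = np.einsum("oc,chw->ohw", w_up, dw)            # 1x1 expansion C -> 4C
    down = np.einsum("co,ohw->chw", w_down, gelu(up))  # 1x1 compression -> C
    # fixed Laplacian high-pass on the input (edge-normalization branch)
    edges = _conv3x3_same(x, np.broadcast_to(LAPLACIAN, x.shape[:1] + (3, 3)))
    return x + down + edge_scale * edges               # residual summation
```

In a framework implementation the two `einsum` calls would be 1×1 convolutions and the depthwise stage a grouped convolution; the structure, not the numerics, is the point here.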
7. Impact and Applications
LD blocks are pivotal in low-latency image enhancement for autonomous vehicles, mobile vision, real-time video analytics, and embedded vision systems. DRK/LD enables on-device inference for blind deblurring, avoiding explicit blur kernel estimation and maintaining minimal resource budget. GAN-based approaches with RFB-s blocks attain strong perceptual restoration in dynamic environments. U-Net–style LD units facilitate recurrent or multi-stage restoration with stable outputs.
The evolution of LD blocks is tightly coupled with advances in real-time edge AI, efficient inverse problem solving, and interpretable kernel learning. It is plausible that future work will further hybridize explicit kernel estimation, efficient multi-scale representations, and theoretical invertibility for even broader applicability.