Efficient Real-World Deblurring
- The paper presents innovative architectures and training protocols that achieve high restoration fidelity (e.g., 31.13 dB PSNR) under stringent compute constraints.
- It details diverse methodologies, including U-Net enhancements, Transformer pruning, and state-space models, to balance precision and efficiency in image and video deblurring.
- Practical implementations emphasize rigorous dataset synthesis, on-device quantization, and hardware-aware evaluation, paving the way for real-time mobile applications.
Efficient real-world deblurring addresses the development, evaluation, and deployment of algorithms for high-fidelity restoration of motion-blurred images or videos under stringent resource constraints, typically for mobile or real-time applications. The field encompasses single-image, local-blur, video, event-based, and joint restoration modalities, and is driven by authentic datasets capturing the diversity and complexity of real-world capture: uncontrolled motion, sensor noise, optical artifacts, and device-specific processing.
1. Benchmark Datasets and Experimental Protocols
Efficient real-world deblurring research relies on rigorously constructed datasets that faithfully represent the statistical and physical properties of real blur. Contemporary benchmarks include:
- RSBlur/AIM 2025 test set: Dual-camera, beam-splitter setup captures time-synchronized sharp/blur pairs under real motion and ISP effects; 420 new pairs for testing (Feijoo et al., 14 Oct 2025).
- SloMoDeblur: Smartphone slow-motion acquisition yields 42,045 blur–sharp pairs across 843 scenes at 1920×1080, with realistic exposure accumulation and sensor noise, enabling large-scale generalization studies (Noki et al., 24 Jun 2025).
- REDS-ME/REDS-RE: For video, exposure- and motion-coupled synthetic-to-real mapping, with multi-exposure/interpolated sequences in REDS-ME and random-exposure sequences in REDS-RE designed for temporal-consistency testing (Youk et al., 4 Dec 2025).
- DavisMCR: Real event-based blur datasets using DVS (Dynamic Vision Sensor) under variable brightness and contrast (Shen et al., 30 Jul 2024).
- ReLoBlur: High-res (e.g., 2152×1436) local-motion blur annotations for sharp-background/distorted-foreground scenarios (Li et al., 2023).
Evaluation metrics standardize quality, perceptual fidelity, and hardware efficiency:
- PSNR, SSIM, LPIPS for restoration accuracy.
- MACs, runtime, memory footprint for compute/resource constraints (cf. P < 5 M, MACs < 200 GMACs for the AIM 2025 challenge (Feijoo et al., 14 Oct 2025)).
- Warping-based alignment for residual computation where the ground-truth reference and the restored image exhibit pixel shifts. A minimal evaluation sketch follows this list.
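To make these checks concrete, here is a minimal sketch of the standard fidelity and budget computations, assuming NumPy images scaled to [0, 1] and a PyTorch model; the function names are illustrative, not from any of the cited papers:

```python
import numpy as np
import torch

def psnr(restored: np.ndarray, sharp: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between restored output and ground truth."""
    mse = np.mean((restored.astype(np.float64) - sharp.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def param_count_millions(model: torch.nn.Module) -> float:
    """Trainable parameters in millions, to check budgets such as P < 5 M."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6
```

MACs are usually measured with a profiler (e.g., per-layer hooks over convolutions) at the challenge's full evaluation resolution, since MAC counts scale with input size.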
2. Architectural Strategies for Efficiency
A diverse set of architectures has emerged to meet the dual demand for precise restoration and hardware efficiency:
- U-Net-based, Channel/Spatial Attention: NAFNet and variants provide baseline efficient pipelines, whose modifications (local channel attention, reparameterized convolutions, EMA weight smoothing) drive leading results under strict parameter and MAC budgets (Feijoo et al., 14 Oct 2025).
- Transformer-based designs: Efficient Restormer (Akmaral et al., 30 Jan 2025), RestormerL, and LMD-ViT (Li et al., 2023) deploy cross-channel attention and adaptive window pruning. AdaWPT in LMD-ViT exploits per-patch blur confidence, prunes Transformer windows, and leverages Gumbel-Softmax for differentiable selection, yielding a 66% FLOPs reduction and 2× speedup over non-pruned counterparts while maintaining or enhancing PSNR (35.42 dB on ReLoBlur); a simplified pruning sketch follows this list.
- State-Space Models (SSM): ALGNet (Gao et al., 29 Mar 2024) combines linear-time SSM branching for global context modeling and simplified channel attention for local detail. Feature aggregation via an elementwise gating achieves up to 25× lower FLOPs than self-attention methods and competitive restoration accuracy.
- Stacked Multi-Patch CNNs: DMPHN (Zhang et al., 2019) organizes convolutional encoders/decoders in fine-to-coarse patch pyramids, with optional stacking for flexible runtime/quality trade-off, achieving 30 ms/image at 720p and real-time 30 fps operation.
- Recurrent and Residual Networks: ESTRNN (Zhong et al., 2021) uses RDB cells fused by global spatio-temporal attention, balancing hierarchical spatial encoding with neighbor-frame fusion, and achieves real-time video deblurring with <1 M parameters.
- Two-Stage/Decoupled Designs: FMA-Net++ (Youk et al., 4 Dec 2025) splits degradation modeling (motion/exposure-aware dynamic filtering) and restoration, enabling parallel temporal propagation. RDNet (Shen et al., 30 Jul 2024) restores degraded DVS events prior to event-guided deblurring, showing leading quality across synthetic and real benchmarks.
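To illustrate the adaptive window pruning behind LMD-ViT's reported FLOPs savings, the sketch below scores each window, draws a hard keep/skip decision via straight-through Gumbel-Softmax, and lets skipped windows bypass attention. This is a simplified stand-in under assumed shapes, not the paper's implementation; `PrunedWindowAttention` and its internals are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrunedWindowAttention(nn.Module):
    """Toy AdaWPT-style block: attend only in windows predicted as blurry."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.decision = nn.Linear(dim, 2)  # per-window logits for [skip, keep]

    def forward(self, windows: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
        # windows: (num_windows, tokens_per_window, dim)
        pooled = windows.mean(dim=1)                     # (W, dim) window summary
        logits = self.decision(pooled)                   # (W, 2)
        # Hard 0/1 sample with straight-through gradients for end-to-end training.
        keep = F.gumbel_softmax(logits, tau=tau, hard=True)[:, 1].view(-1, 1, 1)
        attended, _ = self.attn(windows, windows, windows)
        # Kept windows receive attention output; pruned windows pass through.
        return keep * attended + (1.0 - keep) * windows
```

At inference time, pruned windows would skip the attention computation entirely, which is where the FLOPs reduction comes from; the masking here simply keeps the sketch compact and differentiable.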
3. Specialized Problem Formulations and Algorithmic Innovations
Several specialized formulations further improve efficiency and robustness:
- Blur Pixel Discretization: SegDeblur (Kim et al., 18 Apr 2024) decomposes deblurring into (i) per-pixel blur-class segmentation via log-Fourier kernels and (ii) discrete-to-continuous regression conditioned on the class map. Empirically, 8–16 classes suffice; this approach delivers SOTA or near-SOTA PSNR/SSIM (32.53 dB/0.927 on RealBlur-J) at up to 10× lower MACs than comparable regression models.
- Idempotent Networks: DIN (Mao et al., 2022) incorporates an explicit idempotence constraint, ensuring that repeated passes do not degrade output quality (see the loss sketch after this list). Combined with a compact recurrent U-Net (~3.1 M params), DIN matches or surpasses larger networks on real-world benchmarks (31.92 dB/0.953 on GoPro) at 35 FPS.
- Event-based Restoration: RDNet (Shen et al., 30 Jul 2024) models three DVS degradation sources: threshold bias, limited bandwidth, and circuit noise. A simulation framework produces paired clean-degraded events, and dual-branch encoders fuse them into UNet pipelines. On DavisMCR and REBlur, RDNet scores up to +1.72 dB over the previous best.
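The idempotence constraint in DIN lends itself to a compact auxiliary loss: restoring an already-restored image should change nothing. A minimal sketch, assuming an L1 fidelity term; the weighting `lam` is an illustrative hyperparameter, not a value from the paper:

```python
import torch
import torch.nn.functional as F

def idempotent_loss(model: torch.nn.Module, blurred: torch.Tensor,
                    sharp: torch.Tensor, lam: float = 0.1) -> torch.Tensor:
    restored = model(blurred)
    fidelity = F.l1_loss(restored, sharp)      # standard restoration loss
    twice = model(restored)                    # second pass over own output
    idempotence = F.l1_loss(twice, restored)   # penalize any further change
    return fidelity + lam * idempotence
```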
4. Training, Data Synthesis, and Robustness to Domain Shift
Accuracy and robustness depend critically on matching synthetic training statistics to real acquisition conditions:
- RAW-Domain Blur Synthesis: Stage-wise signal processing (frame interpolation, inverse ISP, RAW accumulation, noise/quantization, forward ISP reconstruction) reduces the domain gap and improves generalization by 0.5–1.0 dB PSNR across architectures (Wei et al., 2022); a simplified synthesis sketch follows this list.
- Data-centric augmentation: Physics-aware forward models (e.g., lens distortion, kernel estimation, object motion) enable scalable on-the-fly synthetic pairs in low-data regimes. Synthetic text/image streams drive “extreme deblurring” pipelines for OCR-critical tasks (Trippe et al., 2022).
- Multi-stage training schedules and targeted loss functions (e.g., progressive patch sizing, frequency loss, structural reparameterization) yield measurable accuracy/robustness gains against real-world artifacts (Feijoo et al., 14 Oct 2025, Akmaral et al., 30 Jan 2025).
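The RAW-domain recipe can be illustrated end to end. The sketch below is a deliberately simplified stand-in that assumes a pure gamma curve as the ISP and a Gaussian approximation of shot/read noise; real pipelines invert device-specific ISPs and model quantization as well:

```python
import numpy as np

def synthesize_blur(frames: np.ndarray, gamma: float = 2.2,
                    shot: float = 0.01, read: float = 0.002) -> np.ndarray:
    """frames: (T, H, W, C) sRGB frames in [0, 1], e.g., after frame interpolation."""
    raw = np.power(frames, gamma)            # inverse ISP: back to linear RAW
    accumulated = raw.mean(axis=0)           # exposure accumulation over the window
    noise_std = np.sqrt(shot * accumulated + read ** 2)
    noisy = accumulated + np.random.normal(size=accumulated.shape) * noise_std
    return np.power(np.clip(noisy, 0.0, 1.0), 1.0 / gamma)  # forward ISP
```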
5. Hardware Constraints, Quantization, and Practical Deployment
Efficient real-world deblurring is governed by explicit compute, memory, and inference rate constraints to enable embedded or mobile deployment:
- Parameter and MAC budgets: For the AIM 2025 challenge, all top algorithms meet P < 5 M and MACs < 200 GMACs, with NAFRepLocal reaching 31.13 dB PSNR at 128.7 ms on full-resolution images (Feijoo et al., 14 Oct 2025).
- Quantization strategies: Converting weights to 8 bits substantially reduces memory footprint and speeds up inference, with marginal accuracy loss (<0.2 dB); a per-tensor quantization sketch follows this list.
- Pruning and cascade: Transformer window pruning (LMD-ViT) and region-wise cascades focus computation only where needed.
- On-device evaluation: SegDeblur-S+ matches commercial solutions (Google Unblur, Samsung EnhanceX) in on-device latency, but at lower GMACs (Kim et al., 18 Apr 2024).
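As a concrete illustration of the 8-bit strategy, the sketch below applies per-tensor symmetric post-training quantization to a model's float weights. This is an assumption-laden simplification: deployment stacks typically use toolchain-specific static quantization, especially for convolution activations:

```python
import torch

def quantize_weights_int8(model: torch.nn.Module) -> dict:
    """Map each float32 tensor to (int8 tensor, scale); dequantize as q.float() * scale."""
    packed = {}
    for name, tensor in model.state_dict().items():
        if tensor.dtype == torch.float32:
            scale = tensor.abs().max().clamp(min=1e-8) / 127.0
            q = torch.clamp(torch.round(tensor / scale), -127, 127).to(torch.int8)
            packed[name] = (q, scale)
        else:
            packed[name] = (tensor, None)  # leave non-float buffers untouched
    return packed
```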
6. Ablations, Comparative Analyses, and Future Directions
Systematic ablations isolate architectural and procedural factors:
- Trade-offs: additional attention heads can compensate for fewer blocks (Efficient Restormer); multi-stage schedules or local-attention modifications yield up to +0.9 dB at constant overhead (Akmaral et al., 30 Jan 2025, Feijoo et al., 14 Oct 2025).
- SloMoDeblur reveals a 2–3 dB drop relative to synthetic-only benchmarks, confirming the need for real blur diversity (Noki et al., 24 Jun 2025).
- Event and motion/exposure decoupling (FMA-Net++) robustly generalizes across synthetic, random-exposure, and real smartphone sequences (Youk et al., 4 Dec 2025).
- Adaptive degradation modeling (e.g., learnable threshold bias, bandwidth) is recommended for event-based vision (Shen et al., 30 Jul 2024).
Roadmaps for the field include dynamic inference, metadata fusion (EXIF, gyroscope), explicit on-device memory/latency constraints, and perception-aligned metric development. Expansion to multi-device, nighttime/low-light, and high-resolution local-blur datasets is advocated to sustain generalizability and deployment.
7. Summary Table of Efficiency-Oriented Methods
| Method / Paper | Core Innovation | Params / Compute | Notable Results |
|---|---|---|---|
| NAFRepLocal (Feijoo et al., 14 Oct 2025) | Local/global channel attention | 4.76 M / 198 GMACs | 31.13 dB PSNR |
| Efficient Restormer (Akmaral et al., 30 Jan 2025) | Pruning, frequency loss, augmentation | 21.2 M / 147 GFLOPs | Maintains PSNR |
| LMD-ViT (Li et al., 2023) | AdaWPT window pruning | 54.5 M / 1.48 TFLOPs | 35.42 dB PSNR |
| ALGNet (Gao et al., 29 Mar 2024) | SSM local/global fusion | 3.85 M / 17 GFLOPs | 33.49 dB PSNR |
| SegDeblur-S (Kim et al., 18 Apr 2024) | Blur-class discretization | 12.3 M / 14.44 GMACs | 32.53 dB PSNR |
| DIN (Mao et al., 2022) | Idempotent recurrent U-Net | 3.11 M / 0.028 s | 31.92 dB PSNR |
| DMPHN (Zhang et al., 2019) | Stacked multi-patch CNN | 6.8–28.9 M / 30 ms | 31.20 dB PSNR |
Efficient real-world deblurring thus synthesizes constrained neural architectures, data-centric training, sensor/model co-design, and adaptive resource management into a unified framework for restoration under practical deployment scenarios. The evolving landscape encompasses innovations at the levels of algorithmic structure, loss formulation, task decoupling, and cross-domain benchmarking, with the goal of bridging the gap between academic quality and on-device feasibility.