
Improving Single-Image Defocus Deblurring: How Dual-Pixel Images Help Through Multi-Task Learning (2108.05251v2)

Published 11 Aug 2021 in cs.CV

Abstract: Many camera sensors use a dual-pixel (DP) design that operates as a rudimentary light field providing two sub-aperture views of a scene in a single capture. The DP sensor was developed to improve how cameras perform autofocus. Since the DP sensor's introduction, researchers have found additional uses for the DP data, such as depth estimation, reflection removal, and defocus deblurring. We are interested in the latter task of defocus deblurring. In particular, we propose a single-image deblurring network that incorporates the two sub-aperture views into a multi-task framework. Specifically, we show that jointly learning to predict the two DP views from a single blurry input image improves the network's ability to learn to deblur the image. Our experiments show this multi-task strategy achieves +1dB PSNR improvement over state-of-the-art defocus deblurring methods. In addition, our multi-task framework allows accurate DP-view synthesis (e.g., ~39dB PSNR) from the single input image. These high-quality DP views can be used for other DP-based applications, such as reflection removal. As part of this effort, we have captured a new dataset of 7,059 high-quality images to support our training for the DP-view synthesis task. Our dataset, code, and trained models are publicly available at https://github.com/Abdullah-Abuolaim/multi-task-defocus-deblurring-dual-pixel-nimat.

Citations (28)

Summary

  • The paper introduces a dual-task deep neural network that leverages dual-pixel data to enhance deblurring performance, achieving a PSNR improvement of about 1 dB.
  • The methodology employs a shared encoder with multiple decoders to concurrently perform deblurring and dual-pixel view synthesis, ensuring efficient cross-task learning.
  • Beyond deblurring, the synthesized DP views enable other DP-based applications such as depth estimation and reflection removal, promising advances in smartphone imaging.

Improving Single-Image Defocus Deblurring with a Multi-Task Framework Leveraging Dual-Pixel Images

The paper "Improving Single-Image Defocus Deblurring: How Dual-Pixel Images Help Through Multi-Task Learning" by Abuolaim et al. introduces a novel approach to defocus deblurring that exploits the dual-pixel (DP) technology found in many modern cameras. The research addresses a challenging problem in computational photography: reducing defocus blur in a single captured image by making use of the additional signal that embedded DP sensors provide.

Overview of the Approach

Dual-pixel sensors capture two sub-aperture views of a scene, which are typically used to enhance autofocus performance by measuring phase differences. This paper harnesses these two sub-aperture views within a multi-task learning framework. The authors propose a convolutional neural network capable of performing single-image deblurring while simultaneously synthesizing the DP views. The novelty lies in leveraging the latent information obtained from both tasks to optimize performance beyond what state-of-the-art single-task deblurring methods can achieve.
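As a rough intuition for why the two views carry deblurring cues (an illustrative toy model, not the paper's formulation): defocus spreads light into mirrored half point-spread functions in the left and right sub-aperture views, so the direction of blur differs between them while their average matches the ordinary full-aperture image. A NumPy sketch on a 1-D signal, with hypothetical half-box kernels:

```python
import numpy as np

def dp_views_1d(sharp_row, radius=3):
    """Approximate the two dual-pixel sub-aperture views of a defocused
    1-D signal with mirrored half-box blur kernels (illustrative only)."""
    size = 2 * radius + 1
    # Left view spreads energy to one side; the right view is its mirror.
    left_kernel = np.zeros(size)
    left_kernel[: radius + 1] = 1.0
    left_kernel /= left_kernel.sum()
    right_kernel = left_kernel[::-1]
    left = np.convolve(sharp_row, left_kernel, mode="same")
    right = np.convolve(sharp_row, right_kernel, mode="same")
    return left, right

row = np.zeros(21)
row[10] = 1.0  # an impulse: its two DP responses are mirrored half-PSFs
left, right = dp_views_1d(row)
# Averaging the two views recovers an ordinary full-aperture blur.
combined = 0.5 * (left + right)
```

The key observation the network can exploit is that the mismatch between the two views encodes the local blur size and direction.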

Contributions and Methodology

Abuolaim et al. introduce a multi-task deep neural network (DNN) that incorporates a deblurring decoder and two DP-view synthesis decoders within a single architecture. This multi-branch strategy allows cross-task sharing of information that enhances learning capacity. Their approach also uses two novel loss functions tailored to the properties of DP image formation to better preserve directional information and reduce blurring artifacts.

  • Multi-Task Framework: The network uses a single shared encoder and three decoders, one producing the deblurred output and two synthesizing the left and right DP views, so that all tasks learn from a shared latent representation.
  • Dataset and Training: To support the framework, the authors capture a new dataset of 7,059 high-quality images providing the DP views alongside the corresponding single-image inputs. Training proceeds in two steps: the network is first trained on the DP-view synthesis task, then the entire network is jointly optimized on all tasks.
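The shared-encoder, multi-decoder structure above can be sketched with toy linear layers (the dimensions, weights, and loss weighting below are hypothetical; the actual model is a convolutional encoder-decoder network with losses tailored to DP image formation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for learned layers (the real network is convolutional).
D_IN, D_LATENT, D_OUT = 16, 8, 16
W_enc = rng.normal(size=(D_LATENT, D_IN))      # shared encoder
W_deblur = rng.normal(size=(D_OUT, D_LATENT))  # deblurring decoder
W_left = rng.normal(size=(D_OUT, D_LATENT))    # left DP-view decoder
W_right = rng.normal(size=(D_OUT, D_LATENT))   # right DP-view decoder

def forward(x):
    """One shared latent code feeds all three task decoders."""
    z = np.tanh(W_enc @ x)
    return W_deblur @ z, W_left @ z, W_right @ z

def multitask_loss(x, sharp, left_gt, right_gt, w_views=0.5):
    """Weighted sum of per-task losses; w_views is a hypothetical weight."""
    deblurred, left, right = forward(x)
    mse = lambda a, b: np.mean((a - b) ** 2)
    return mse(deblurred, sharp) + w_views * (mse(left, left_gt) + mse(right, right_gt))

x = rng.normal(size=D_IN)
loss = multitask_loss(x, x, x, x)
```

Because the view-synthesis gradients flow through the same encoder as the deblurring gradients, the auxiliary tasks regularize the shared representation, which is the mechanism the paper credits for the deblurring gain.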

Results

Quantitative evaluation shows a PSNR improvement of approximately +1 dB over existing defocus deblurring methods. The network also achieves roughly 39 dB PSNR for DP-view synthesis, and the synthesized views are shown to be accurate enough to support downstream DP-based tasks such as reflection removal.
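For context on these numbers, PSNR is defined as 10·log10(MAX²/MSE), so a +1 dB gain corresponds to roughly a 21% reduction in mean squared error (a factor of 10^-0.1). A minimal computation, using a synthetic noisy image for illustration:

```python
import numpy as np

def psnr(reference, estimate, max_val=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    mse = np.mean((reference - estimate) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

rng = np.random.default_rng(0)
img = rng.random((64, 64))
noisy = np.clip(img + rng.normal(scale=0.05, size=img.shape), 0.0, 1.0)
# Reducing MSE by a factor of 10**-0.1 (~21%) raises PSNR by exactly 1 dB.
print(f"PSNR: {psnr(img, noisy):.1f} dB")
```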

Implications and Future Directions

This work demonstrates that multi-task learning not only aids defocus deblurring but also facilitates applications such as depth estimation and reflection removal, advancing the utility of DP sensors. The approach is particularly relevant for smartphone cameras, where access to raw DP data is often restricted, since the framework can synthesize the two views from a single image. Future work could extend this paradigm to other imaging tasks with similar latent task interdependencies, and investigate the network's robustness across diverse sensor architectures and lighting conditions.

In essence, this research bridges a crucial gap in defocus deblurring by effectively incorporating dual-pixel data, setting a precedent for subsequent developments in computational photography and vision.
