
Single Image HDR Reconstruction Using a CNN with Masked Features and Perceptual Loss (2005.07335v1)

Published 15 May 2020 in eess.IV and cs.GR

Abstract: Digital cameras can only capture a limited range of real-world scenes' luminance, producing images with saturated pixels. Existing single image high dynamic range (HDR) reconstruction methods attempt to expand the range of luminance, but are not able to hallucinate plausible textures, producing results with artifacts in the saturated areas. In this paper, we present a novel learning-based approach to reconstruct an HDR image by recovering the saturated pixels of an input LDR image in a visually pleasing way. Previous deep learning-based methods apply the same convolutional filters on well-exposed and saturated pixels, creating ambiguity during training and leading to checkerboard and halo artifacts. To overcome this problem, we propose a feature masking mechanism that reduces the contribution of the features from the saturated areas. Moreover, we adapt the VGG-based perceptual loss function to our application to be able to synthesize visually pleasing textures. Since the number of HDR images for training is limited, we propose to train our system in two stages. Specifically, we first train our system on a large number of images for image inpainting task and then fine-tune it on HDR reconstruction. Since most of the HDR examples contain smooth regions that are simple to reconstruct, we propose a sampling strategy to select challenging training patches during the HDR fine-tuning stage. We demonstrate through experimental results that our approach can reconstruct visually pleasing HDR results, better than the current state of the art on a wide range of scenes.

Citations (139)

Summary

  • The paper introduces a feature masking mechanism that reduces halo artifacts by focusing on well-exposed pixels during training.
  • It employs VGG-based perceptual loss and a two-stage training process to enhance texture synthesis and overcome HDR dataset scarcity.
  • Experimental results demonstrate superior performance with lower MSE and higher HDR-VDP-2 scores, enabling practical HDR imaging in consumer devices.

Single Image HDR Reconstruction Using a CNN with Masked Features and Perceptual Loss

This paper introduces a method for reconstructing a High Dynamic Range (HDR) image from a single Low Dynamic Range (LDR) image, using a Convolutional Neural Network (CNN) with feature masking and a perceptual loss. Digital cameras capture only a limited luminance range, which often leads to saturated pixels in high-contrast scenes. Standard HDR reconstruction methods typically require multiple images at varied exposures, which can be cumbersome and impractical in dynamic environments. The paper avoids this requirement by reconstructing HDR content from a single image.

Key Contributions

  1. Feature Masking Mechanism: The authors address the checkerboard and halo artifacts that arise when previous deep learning approaches apply the same convolutional filters to well-exposed and saturated pixels. Their feature masking strategy reduces the contribution of the unreliable features originating from saturated regions, so the network relies mainly on well-exposed pixels. This reduces ambiguity during training and leads to cleaner HDR outputs with fewer artifacts.
  2. Adaptation of Perceptual Loss: The authors adapt the VGG-based perceptual loss to HDR reconstruction so that the network can synthesize plausible textures in saturated areas. Perceptual loss encourages the network to produce images whose feature representations are similar to those of the ground truth, which yields more realistic textures.
  3. Two-stage Training: The authors train the system in two stages. The model is first pre-trained on a large number of images for an image inpainting task, then fine-tuned on HDR reconstruction. The inpainting pre-training lets the CNN learn to fill in missing content, which reduces its dependence on the limited number of HDR training images.
  4. Patch Sampling Strategy: Because most HDR examples contain smooth regions that are simple to reconstruct, a targeted sampling strategy selects challenging training patches, such as those containing complex textures, during the HDR fine-tuning stage. This improves the network's ability to reconstruct the fine details needed for high-quality HDR images.
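
The masking and patch-selection ideas above can be illustrated with a minimal sketch. It is not the authors' implementation: the saturation threshold of 0.95, the linear soft-mask ramp, the gradient-based texture score, and all function names are assumptions made for illustration, and the sketch uses plain Python lists instead of a tensor library.

```python
def soft_exposure_mask(image, threshold=0.95):
    """Per-pixel mask in [0, 1]: 1.0 = well exposed, 0.0 = fully saturated.

    image: rows of pixels, each pixel a tuple of channel values in [0, 1].
    threshold: hypothetical saturation threshold (an assumption).
    """
    mask = []
    for row in image:
        mask_row = []
        for pixel in row:
            brightest = max(pixel)  # a pixel saturates when any channel clips
            saturation = (brightest - threshold) / (1.0 - threshold)
            mask_row.append(1.0 - min(max(saturation, 0.0), 1.0))
        mask.append(mask_row)
    return mask


def mask_features(features, mask):
    """Scale each pixel's feature vector by its mask value, so features
    from saturated pixels contribute less to later layers."""
    return [
        [[value * m for value in feat] for feat, m in zip(feat_row, mask_row)]
        for feat_row, mask_row in zip(features, mask)
    ]


def is_challenging_patch(hdr_luminance, mask, min_texture=0.1):
    """Hypothetical patch filter: keep a patch only if the ground-truth HDR
    content under its saturated pixels is textured, measured here by the
    mean horizontal gradient magnitude (last column is skipped)."""
    total, count = 0.0, 0
    for lum_row, mask_row in zip(hdr_luminance, mask):
        for x in range(len(lum_row) - 1):
            if mask_row[x] < 1.0:  # pixel is at least partly saturated
                total += abs(lum_row[x + 1] - lum_row[x])
                count += 1
    return count > 0 and total / count >= min_texture


# One-row example: a dark pixel, a nearly saturated pixel, a clipped pixel.
image = [[(0.2, 0.3, 0.4), (0.975, 0.5, 0.5), (1.0, 1.0, 1.0)]]
mask = soft_exposure_mask(image)
features = [[[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]]
masked = mask_features(features, mask)
```

In this example the dark pixel keeps its features unchanged, the nearly saturated pixel has them roughly halved, and the clipped pixel has them zeroed. A soft ramp rather than a binary cut-off is one plausible design choice, since it avoids an abrupt transition at the saturation boundary.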

Experimental Results

The approach presented by the authors shows superior performance over existing state-of-the-art methods in both synthetic and real-world settings. The quantitative analysis, in terms of mean squared error (MSE) and the HDR-VDP-2 metric, underscores the capability of this method to accurately predict the full luminance range. Qualitatively, the reconstructions exhibit less blurriness and more accurately reflect real-world textures compared to competing methods like those of Endo et al. and Eilertsen et al.

Implications and Future Prospects

This research moves single-image HDR reconstruction closer to practical use in consumer digital cameras and smartphones. Reconstructing HDR content from a single image could simplify HDR photo capture by reducing the reliance on multi-exposure processes. Moreover, the application of perceptual loss in this context suggests that similar strategies may be useful in other image synthesis tasks.

For future developments, integrating temporal regularization techniques could stabilize HDR sequences in video formats, addressing temporal consistency issues seen in current implementations. Additionally, refining network architectures to optimize computational performance will be pivotal in extending these methods to resource-constrained environments.

In conclusion, the paper presents a framework for single-image HDR reconstruction that improves on earlier single-image methods. Its feature masking, perceptual loss, two-stage training, and patch sampling together address artifacts in saturated regions and the scarcity of HDR training data.
