- The paper presents a non-flow-based deep learning method that translates LDR images into HDR outputs despite significant foreground motion.
- The approach employs an encoder-decoder architecture (UNet or ResNet) that takes channel-wise concatenated LDR inputs and their gamma-corrected HDR-domain counterparts, learning to hallucinate missing details.
- The method outperforms traditional HDR techniques in efficiency and quality, as shown by improved PSNR, SSIM, and HDR-VDP-2 metrics.
Deep High Dynamic Range Imaging with Large Foreground Motions
In the paper titled "Deep High Dynamic Range Imaging with Large Foreground Motions," the authors present a novel approach to overcoming challenges encountered in high dynamic range (HDR) imaging, specifically focusing on dynamic scenes with significant foreground movements. HDR imaging aims to reproduce a greater dynamic range of luminosity than is possible with standard digital imaging or photographic techniques. Traditional methods often rely on optical flow techniques to align input images before merging, which poses difficulties due to occlusion and extensive motion. This paper introduces a non-flow-based deep learning framework that reformulates the task as an image translation problem.
The authors leverage a deep neural network to translate a sequence of low dynamic range (LDR) images into an HDR image without explicitly estimating optical flow. Avoiding flow-based alignment is notable because such alignment can introduce geometric distortions and artifacts, particularly under large foreground motions or large variations in exposure. The proposed method is robust to these challenges and includes mechanisms to infer, or “hallucinate,” plausible image details under complete occlusion, saturation, and under-exposure.
Key Contributions and Methodology
- End-to-End Learning Framework: The authors propose an encoder-decoder architecture for HDR merging, experimenting with two structures, UNet and ResNet. The network takes as input a channel-wise concatenation of the LDR images and their corresponding HDR-domain representations (a minimal architecture sketch follows this list).
- Data Processing: The input LDR images undergo radiometric calibration and gamma correction before being fed to the network (see the mapping sketch after this list). The network is designed to handle missing information in the LDR shots, learning to hallucinate details that are absent due to occlusion or saturation.
- Background Alignment: While the network handles foreground motions, the authors apply a simple homography transformation to align the backgrounds, avoiding blurring artifacts caused by background motion and simplifying network training (an alignment sketch follows this list).
- Flexible Reference Frame: The framework supports choosing different input LDRs as the reference image, demonstrating the approach's adaptability to varying input conditions.
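To make the architecture concrete, the following is a minimal PyTorch sketch of an encoder-decoder in the spirit of the paper's UNet variant. The depth, channel widths, and activations are illustrative assumptions, not the authors' exact configuration; only the input convention (three exposures, each stacked with its HDR-domain counterpart, giving 18 channels) follows the paper's description.

```python
import torch
import torch.nn as nn

class MiniUNet(nn.Module):
    """Toy encoder-decoder in the spirit of the paper's UNet variant.

    Input: 3 LDR exposures, each concatenated channel-wise with its
    gamma-corrected HDR-domain counterpart -> 3 * (3 + 3) = 18 channels.
    Output: a 3-channel HDR estimate. Depth and widths are illustrative.
    """

    def __init__(self, in_ch=18, out_ch=3, base=64):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(in_ch, base, 4, 2, 1), nn.LeakyReLU(0.2))
        self.enc2 = nn.Sequential(nn.Conv2d(base, base * 2, 4, 2, 1), nn.LeakyReLU(0.2))
        self.dec2 = nn.Sequential(nn.ConvTranspose2d(base * 2, base, 4, 2, 1), nn.ReLU())
        # The skip connection doubles the channel count entering the last stage.
        self.dec1 = nn.Sequential(nn.ConvTranspose2d(base * 2, out_ch, 4, 2, 1), nn.ReLU())

    def forward(self, x):
        e1 = self.enc1(x)                            # half resolution
        e2 = self.enc2(e1)                           # quarter resolution
        d2 = self.dec2(e2)                           # back to half resolution
        return self.dec1(torch.cat([d2, e1], dim=1))  # skip connection, full resolution
```

A ResNet-style variant would replace the skip-connected decoder with residual blocks at a fixed resolution; either way the network maps the 18-channel stack directly to a 3-channel HDR estimate.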
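The preprocessing in the Data Processing bullet is commonly implemented by inverting the camera's gamma curve and normalizing by exposure time, i.e. H_i = L_i^γ / t_i. The sketch below assumes γ = 2.2 and inputs normalized to [0, 1]; the exact calibration the authors use may differ.

```python
import numpy as np

GAMMA = 2.2  # standard display gamma; the exact value is an assumption here

def ldr_to_hdr_domain(ldr, exposure_time):
    """Map an LDR image in [0, 1] to the HDR domain.

    Inverts the gamma curve and normalizes by exposure time, following
    the common H_i = L_i**gamma / t_i preprocessing used in learning-based
    HDR merging (a sketch, not the paper's exact code).
    """
    ldr = np.clip(ldr, 0.0, 1.0)
    return ldr ** GAMMA / exposure_time

# The network input stacks each LDR with its HDR-domain version, e.g.:
# x_i = np.concatenate([ldr_i, ldr_to_hdr_domain(ldr_i, t_i)], axis=-1)
```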
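The background-alignment step could look like the following OpenCV sketch: match features between each exposure and the reference, fit a single homography with RANSAC, and warp. The feature detector (ORB), match budget, and reprojection threshold are assumptions; with differently exposed frames, one would likely brightness-normalize before matching.

```python
import cv2
import numpy as np

def align_background(src, ref):
    """Warp src onto ref with a single homography (background-only alignment).

    Foreground motion is deliberately left unaligned for the network to
    handle; this only removes global camera motion between exposures.
    """
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(src, None)
    k2, d2 = orb.detectAndCompute(ref, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    matches = sorted(matches, key=lambda m: m.distance)[:200]
    pts_src = np.float32([k1[m.queryIdx].pt for m in matches])
    pts_ref = np.float32([k2[m.trainIdx].pt for m in matches])
    H, _ = cv2.findHomography(pts_src, pts_ref, cv2.RANSAC, 5.0)
    h, w = ref.shape[:2]
    return cv2.warpPerspective(src, H, (w, h))
```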
Experimental Results
The methodology is evaluated against several state-of-the-art HDR merging techniques with comprehensive qualitative and quantitative comparisons. PSNR and SSIM are reported alongside HDR-VDP-2, on which the framework surpasses existing methods. Computational efficiency is also a highlight: the approach runs faster than several established techniques even on a CPU. The ability to hallucinate plausible details proves significant, addressing limitations in scenarios with heavy saturation or occlusion.
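Because raw HDR values span a huge range, such metrics are typically computed on tonemapped outputs. A sketch using the μ-law range compressor common in this line of work is shown below; the constant μ = 5000 and the unit peak are assumptions.

```python
import numpy as np

MU = 5000.0  # mu-law compression constant commonly used in this line of work

def mu_law_tonemap(hdr):
    """Differentiable range compression applied before computing metrics or losses."""
    return np.log1p(MU * hdr) / np.log1p(MU)

def psnr(pred, target, peak=1.0):
    """Peak signal-to-noise ratio between two tonemapped images in [0, 1]."""
    mse = np.mean((pred - target) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# Evaluation on tonemapped outputs:
# score = psnr(mu_law_tonemap(hdr_pred), mu_law_tonemap(hdr_gt))
```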
Implications and Future Directions
The authors articulate a shift in how HDR imaging can be approached, moving from flow-based optical alignment to an end-to-end learning framework that inherently mitigates distortion and artifacts. This presents a practical advancement for dynamic scenes where traditional methods falter. The ability of the network to hallucinate details underscores potential applications in scenes with significant information loss due to motion or lighting variations.
For future work, the authors indicate an interest in enhancing the hallucination processes, potentially incorporating high-level semantics or contextual data to strengthen the recovery of details in extensively saturated areas. This research paves the way for more sophisticated and versatile imaging systems capable of overcoming the limitations of existing HDR methods, with implications for both computational photography and real-time image synthesis.
This contribution to HDR imaging represents a meaningful advance in applying deep learning to a classical imaging problem and sets the stage for continued innovation at the intersection of computational photography and neural networks.