- The paper presents a novel fully convolutional network that leverages hypercolumn features from VGG-19 to distinguish transmission and reflection layers.
- It integrates perceptual, adversarial, and exclusion losses to capture both low-level and high-level image semantics for improved separation results.
- Experimental results show enhanced PSNR, SSIM, and subjective quality on synthetic and real-world datasets, highlighting its potential for broader image enhancement tasks.
Single Image Reflection Separation with Perceptual Losses: A Technical Exposition
This paper addresses the complex challenge of separating reflections from a single image, a persistent problem in computer vision. The authors propose a fully convolutional network trained end-to-end with perceptual losses. By combining low-level and high-level image information, the method outperforms prior state-of-the-art reflection removal techniques.
Methodology
The central contribution of this research is the design and implementation of a novel network architecture for single image reflection separation. The network is fully convolutional and augments its input with hypercolumn features extracted from a pre-trained VGG-19 network. These hypercolumns enrich the input with multi-level features, leveraging VGG-19's ability to abstract image semantics across scales. The architecture itself resembles a context aggregation network and uses dilated convolutions to achieve a large receptive field of 513×513 pixels, which is crucial for capturing global image context.
A key advancement in this work is the introduction of perceptual losses, which include feature loss, adversarial loss, and exclusion loss, each serving a distinct purpose:
- Feature Loss: Utilizes a loss computed in a feature space extracted from VGG-19, assisting the network in capturing high-level image semantics essential for distinguishing between transmission and reflection layers.
- Adversarial Loss: An adversarial network, trained in a conditional-GAN-like setup, encourages the predicted transmission layer to look realistic, reducing residual reflection artifacts and improving visual fidelity.
- Exclusion Loss: A newly proposed loss that enforces decorrelation between the gradients of the transmission and reflection layers, enhancing the independence of the two components in the image decomposition.
Dataset and Evaluation
The authors create a comprehensive dataset comprising both synthetic and real-world images. The synthetic data is generated by blending pairs of natural images, with the reflection image blurred and attenuated to mimic the appearance of real reflections. Importantly, a novel real-world dataset, captured under varied lighting, environments, and camera settings, provides a benchmark for quantitative assessment. Although modest in size, this curated dataset enables robust evaluation of method efficacy across diverse scenarios.
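The synthetic blending procedure can be sketched as follows. This is a hedged illustration of the general recipe (blur the reflection layer, attenuate, blend, clip); the function name, blur sigma, and blend weight are illustrative choices, not the paper's exact pipeline.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def synthesize_reflection(T, R, sigma=3.0, alpha=0.8):
    """Sketch of synthetic training-pair generation: blend a transmission
    image T with a blurred, attenuated reflection image R. Both inputs are
    float (H, W, 3) arrays in [0, 1]; parameters are illustrative."""
    # Blur only the spatial axes, leaving channels untouched.
    R_blur = gaussian_filter(R, sigma=(sigma, sigma, 0))
    # Weighted blend, clipped back into the valid intensity range.
    I = alpha * T + (1 - alpha) * R_blur
    return np.clip(I, 0.0, 1.0)
```

Each synthetic triple (I, T, R) then supervises the network: I is the input, and T and R are the ground-truth layers.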
Results
The method is demonstrated to surpass existing algorithms, including CEILNet, on standard metrics such as PSNR and SSIM. Furthermore, it shows substantial improvement in a structured user study, indicating favorable subjective quality.
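For reference, PSNR, the primary quantitative metric cited above, is straightforward to compute; a minimal implementation for float images in [0, peak]:

```python
import numpy as np

def psnr(ref, est, peak=1.0):
    """Peak signal-to-noise ratio in dB between a reference image and an
    estimate. Returns infinity for identical images (zero error)."""
    mse = np.mean((ref.astype(np.float64) - est.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```

SSIM, by contrast, compares local luminance, contrast, and structure statistics, and is typically taken from a library such as scikit-image rather than reimplemented.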
The paper also highlights extensions of the method to other image enhancement tasks such as flare removal and dehazing, underscoring the model's versatility. These extensions are achieved without additional task-specific training, showcasing the model's potential for generalization across related image processing challenges.
Implications and Future Directions
The implications of this work are twofold: firstly, it provides an effective solution for single image reflection separation, with implications for fields requiring high-quality image processing, such as photography, surveillance, and augmented reality. Secondly, the method opens avenues for future research on layer separation tasks using perceptual losses, hinting at broader applicability in enhancement tasks across the AI image processing spectrum.
The paper suggests directions for future work in addressing limitations, particularly in scenarios where the reflection layer is as sharp and prominent as the transmission layer, a persistent challenge for separation algorithms. Continued exploration of network training paradigms and dataset enrichment could alleviate such limitations.
In summary, this paper makes a substantive contribution to the domain of reflection separation and, by extension, the broader field of image enhancement. It is a commendable synthesis of deep learning techniques and perceptual frameworks, paving the way for subsequent innovations in AI-driven image processing.