- The paper introduces a white-box framework that uses unpaired data and deep reinforcement learning to automate photo retouching.
- It employs resolution-independent, differentiable filters with CNN and GAN components to mimic user editing styles.
- Experiments show it outperforms baselines such as CycleGAN, while its white-box design offers transparency and practical insights for computational photography.
Understanding the "Exposure: A White-Box Photo Post-Processing Framework" Paper
The paper "Exposure: A White-Box Photo Post-Processing Framework" introduces a novel approach to automate photo retouching by leveraging unpaired data. This framework offers a versatile solution employing deep learning techniques to process and enhance RAW photos, aiming to replicate user-preferred aesthetic styles traditionally achieved through manual photo editing.
At its core, the framework diverges from conventional methods that rely on paired data for supervised learning. Instead, it combines Convolutional Neural Networks (CNNs) and Reinforcement Learning (RL), augmented by Generative Adversarial Networks (GANs), to automate and explain the retouching process. The system learns retouching operations from a set of unpaired photographs that reflect personal preferences, which makes curating a training dataset far easier than collecting before-and-after pairs.
Methodology
A significant contribution of this research is the formulation of retouching operations as differentiable, resolution-independent filters. This design keeps edits computationally tractable on high-resolution images, sidestepping a common limitation of neural networks that are constrained to low-resolution inputs. Furthermore, the white-box approach not only delivers visually appealing results but also exposes the sequence of operations applied, offering users a degree of transparency and control inaccessible with black-box systems.
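To make this concrete, below is a minimal sketch of one such filter, written with PyTorch; the class name, tensor layout, and scalar parameterization are illustrative assumptions, not the paper's actual implementation. An exposure adjustment is applied pointwise, so it behaves identically at any resolution, and because it is built from differentiable operations, gradients flow back to whatever network predicts its parameter:

```python
import torch

class ExposureFilter(torch.nn.Module):
    """Hypothetical exposure filter: scales pixel values by 2**ev.

    Pointwise, so it works identically at any resolution, and
    differentiable in `ev`, so a CNN that predicts `ev` can be
    trained end to end through the filter.
    """
    def forward(self, image: torch.Tensor, ev: torch.Tensor) -> torch.Tensor:
        # image: (B, C, H, W) in linear RGB; ev: (B,) exposure values
        return image * torch.pow(2.0, ev).view(-1, 1, 1, 1)

img = torch.rand(1, 3, 64, 64)            # any H and W work the same way
ev = torch.zeros(1, requires_grad=True)   # the parameter a network would output
out = ExposureFilter()(img, ev)
out.mean().backward()                     # gradient w.r.t. ev lands in ev.grad
```

The same pattern extends to the paper's other filters (gamma, white balance, saturation, tone curves, and so on), each reduced to a small set of learnable parameters.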
The architecture employs a deep reinforcement learning component to sequence filter applications, modeling retouching as a series of decisions: at each step, an agent observes the current image and chooses which filter to apply and with what parameters. A GAN discriminator, trained on the user's photo collection, learns the stylistic attributes of that collection and scores how closely an edited result matches them, helping the system reproduce complex edits normally performed by human photographers.
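The sketch below shows, under loudly stated assumptions, how these pieces might fit together: the tiny policy network, the two-filter set, and the stand-in discriminator are illustrative placeholders rather than the paper's architecture, and the update is a generic REINFORCE-style step rather than the authors' exact training procedure.

```python
import torch
import torch.nn as nn

def exposure(img, p):                  # pointwise, differentiable
    return img * torch.pow(2.0, p)

def gamma(img, p):                     # pointwise, differentiable
    return img.clamp(min=1e-6) ** torch.exp(p)

FILTERS = [exposure, gamma]            # stand-ins for the paper's filter set

class Policy(nn.Module):
    """Maps a downsampled view of the current image to a filter choice
    plus one scalar parameter per filter (illustrative architecture)."""
    def __init__(self, n_filters):
        super().__init__()
        self.net = nn.Sequential(
            nn.AdaptiveAvgPool2d(8), nn.Flatten(),
            nn.Linear(3 * 8 * 8, 64), nn.ReLU(),
            nn.Linear(64, 2 * n_filters),
        )
    def forward(self, img):
        out = self.net(img)
        n = out.shape[-1] // 2
        return out[:, :n], out[:, n:]  # (filter logits, filter parameters)

policy = Policy(len(FILTERS))
discriminator = lambda im: -(im.mean() - 0.5) ** 2   # dummy style critic

img = torch.rand(1, 3, 512, 512)       # filters work at any resolution
log_probs = []
for _ in range(3):                     # a short editing episode
    logits, params = policy(img)
    dist = torch.distributions.Categorical(logits=logits)
    action = dist.sample()             # which filter to apply next
    log_probs.append(dist.log_prob(action))
    idx = int(action)
    img = FILTERS[idx](img, params[0, idx])
reward = discriminator(img)            # GAN critic score stands in for reward
# REINFORCE handles the discrete filter choices; the differentiable
# filters also let gradients reach the parameter head through `reward`.
loss = -reward - reward.detach() * torch.stack(log_probs).sum()
loss.backward()
```

In the full system the discriminator is itself trained adversarially against the retouched outputs, so the reward signal and the policy improve together.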
Results and Validation
The proposed framework is validated both qualitatively and quantitatively. Compared against baselines such as CycleGAN, which also leverages unpaired data but suffers from resolution constraints and visual artifacts, "Exposure" demonstrates superior performance. Across experiments on the MIT-Adobe FiveK dataset and custom collections from 500px.com, the paper reports significant improvements in user ratings and in histogram intersection, a metric for the similarity between the value distributions of output and target images.
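For reference, the histogram-intersection metric is straightforward to compute; here is a small NumPy sketch, where the bin count and value range are arbitrary illustrative choices:

```python
import numpy as np

def histogram_intersection(a, b, bins=32, value_range=(0.0, 1.0)):
    """Similarity in [0, 1] between the value distributions of two
    images; 1.0 means the normalized histograms are identical."""
    ha, _ = np.histogram(a, bins=bins, range=value_range)
    hb, _ = np.histogram(b, bins=bins, range=value_range)
    ha = ha / ha.sum()                 # normalize to probability mass
    hb = hb / hb.sum()
    return float(np.minimum(ha, hb).sum())

# Example: compare an output image against a target-style image
out, target = np.random.rand(256, 256), np.random.rand(256, 256)
print(histogram_intersection(out, target))
```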
The system's potential is further validated through user studies in which it surpasses novice users at producing desirable edits. This underlines its practicality both as an automated tool that improves ordinary users' photographs and as a professional's instrument for reverse engineering the stylistic processes embedded in complex filters.
Implications and Future Directions
The implications of this research are noteworthy. Practically, this framework serves as an efficient and effective photo post-processing tool for non-experts while offering insights into the operation of personal or commercial filters for advanced users. Theoretically, it opens avenues for future developments in AI-driven image processing by integrating interpretable deep learning frameworks. The authors hint at prospective enhancements, such as incorporating local retouching operations and developing more scalable deep learning architectures applicable to larger datasets or more complex editing tasks.
In conclusion, the paper makes a substantial contribution to computational photography, presenting a method that balances usability and interpretability: it applies complex, user-style-specific retouching operations automatically, while its underlying framework lifts the "black-box" veil typical of current deep learning solutions. The system's demonstrated scalability, adaptability, and transparency offer fertile ground for future advances in AI-aided photography and image editing.