- The paper introduces a novel supervised learning method using a multi-scale neural network to predict HDR lighting from LDR portraits.
- It employs a comprehensive light stage dataset and a novel non-negative least squares solver to reconstruct clipped light intensities.
- It demonstrates improved rendering for diverse skin tones and material BRDFs, significantly advancing AR and visual effects applications.
Overview of "Learning Illumination from Diverse Portraits"
The paper "Learning Illumination from Diverse Portraits," authored by researchers at Google, introduces a learning-based method for estimating high dynamic range (HDR), omnidirectional lighting from a single low dynamic range (LDR) portrait. The approach addresses a central challenge in both augmented reality (AR) and visual effects production: virtual content must be rendered under lighting that is consistent with the real-world lighting of the scene it is composited into.
Methodology
The researchers take a supervised learning approach, training their model on a dataset generated by relighting 70 subjects captured in a light stage under a large collection of HDR lighting environments. The light stage records each subject's reflectance field, which can be combined with any lighting environment via image-based relighting to synthesize photo-realistic training portraits. Ground-truth illumination is obtained by promoting LDR panorama captures to HDR with a non-negative least squares solver that reconstructs the clipped intensities of saturated light sources, information that conventional LDR captures lose.
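The paper's exact solver formulation is not reproduced in this summary; the following is a minimal sketch of the idea behind the HDR recovery step, under simplifying assumptions: the unclipped pixels of a diffuse reference sphere are modeled as a linear combination of directional lights, and the intensities of lights that saturated in the LDR capture are recovered with a non-negative least squares solve. The function name `recover_clipped_lights` and the toy rendering matrix `A` are illustrative, not taken from the paper.

```python
import numpy as np
from scipy.optimize import nnls

def recover_clipped_lights(A, observed, clipped_mask, ldr_values):
    """Toy sketch: recover intensities of clipped lights via NNLS.

    A            : (n_pixels, n_lights) linear rendering matrix, e.g. each column
                   is a diffuse reference sphere rendered under one light direction.
    observed     : (n_pixels,) unclipped pixel values of the reference sphere.
    clipped_mask : (n_lights,) boolean, True where the LDR panorama saturated.
    ldr_values   : (n_lights,) LDR light intensities (trusted where not clipped).
    """
    # Contribution of the lights that were captured correctly (not clipped).
    known = A[:, ~clipped_mask] @ ldr_values[~clipped_mask]
    residual = observed - known

    # Solve for the missing (clipped) intensities, constrained to be non-negative.
    x, _ = nnls(A[:, clipped_mask], residual)

    hdr = ldr_values.copy()
    hdr[clipped_mask] = np.maximum(hdr[clipped_mask], x)  # clipped values can only grow
    return hdr
```

In practice the linear model would be built from calibrated renderings of the reference spheres; here `A` is simply a stand-in for that relationship.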
The proposed neural network uses a multi-scale architecture trained with rendering-based loss functions and an adversarial loss to estimate plausible high-frequency lighting detail. The multi-scale adversarial loss is a key contribution: it yields lighting estimates that are useful for rendering a range of materials, not only Lambertian surfaces.
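As a rough illustration of what a rendering-based loss over several BRDFs could look like (the paper's actual losses, network architecture, and lighting representation are more involved, and the adversarial term is omitted here), the PyTorch sketch below convolves predicted and ground-truth lighting maps with per-material filters before comparing them; the kernel construction and function names are assumptions made for this example.

```python
import torch
import torch.nn.functional as F

def rendering_loss(pred_light, gt_light, brdf_kernels):
    """Toy multi-material rendering loss.

    pred_light, gt_light : (B, 3, H, W) equirectangular lighting maps.
    brdf_kernels         : list of (1, 1, k, k) blur kernels standing in for
                           diffuse / matte-silver / mirror BRDF convolutions.
    """
    loss = 0.0
    for kernel in brdf_kernels:
        k = kernel.repeat(3, 1, 1, 1)              # same filter for each color channel
        pad = kernel.shape[-1] // 2
        pred_r = F.conv2d(pred_light, k, padding=pad, groups=3)
        gt_r = F.conv2d(gt_light, k, padding=pad, groups=3)
        # Compare "renders" in a compressed (log) domain to balance bright sources.
        loss = loss + F.l1_loss(torch.log1p(pred_r.clamp(min=0)),
                                torch.log1p(gt_r.clamp(min=0)))
    return loss
```

A sharper (smaller) kernel stands in for a shinier material, so the loss penalizes errors at multiple angular frequencies of the lighting.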
Results
Quantitatively, the method outperforms state-of-the-art portrait-based lighting estimation, including the method of Sun et al. and second-order spherical harmonics (SH) approximations. It produces more realistic renderings for materials with diverse BRDFs, demonstrated through comparisons of diffuse, matte silver, and mirror spheres rendered under estimated versus ground-truth lighting. Importantly, the method performs consistently across diverse skin tones, mitigating the intrinsic ambiguity between surface reflectance and light source strength, an issue earlier techniques did not explicitly address.
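These comparisons amount to rendering the same reference spheres under estimated and ground-truth lighting and measuring the difference; a minimal sketch of such a per-sphere error (a plain RMSE, not necessarily the paper's exact metric) is shown below.

```python
import numpy as np

def sphere_rmse(render_pred, render_gt, mask=None):
    """RMSE between a sphere rendered with predicted vs. ground-truth lighting.

    render_pred, render_gt : (H, W, 3) linear-intensity renders of the same sphere
                             (e.g. diffuse, matte silver, or mirror).
    mask                   : optional (H, W) boolean mask selecting sphere pixels.
    """
    diff = render_pred.astype(np.float64) - render_gt.astype(np.float64)
    if mask is not None:
        diff = diff[mask]
    return float(np.sqrt(np.mean(diff ** 2)))

# Illustrative usage: report the error separately for each reference material.
# errors = {name: sphere_rmse(pred[name], gt[name], sphere_mask)
#           for name in ("diffuse", "matte_silver", "mirror")}
```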
Implications and Future Directions
The practical implications of this work are notable, particularly in AR and visual effects, where virtual objects must be lit consistently with real-world subjects. Because inference runs in real time on mobile devices, the method enables realistic rendering and compositing of virtual objects into live video, a significant step for AR applications.
Theoretically, this paper highlights the value of multi-scale learning and adversarial losses for recovering high-frequency detail in estimated lighting. Future research could extend the approach to non-distant lighting conditions or to dynamic environments, and training on subjects with a wider range of expressions and accessories could yield even more robust lighting estimates.
In conclusion, "Learning Illumination from Diverse Portraits" represents a significant advance in automatic lighting estimation capabilities, offering a powerful tool for realistic compositing in AR and visual effects, with broader implications for real-world applications.