- The paper presents a two-stage framework that integrates physics-based diffuse rendering with a residual correction stage to enhance face relighting realism under directional light.
- It leverages a novel dataset captured with a controlled light stage to train the model and demonstrates superior performance over traditional methods.
- The hybrid approach bridges physical principles and neural networks, improving generalization and robustness for practical augmented reality applications.
Learning Physics-guided Face Relighting under Directional Light
The paper "Learning Physics-guided Face Relighting under Directional Light" addresses the challenge of facial relighting for enhanced realism in augmented reality (AR) applications. The authors develop an end-to-end deep learning framework that transforms face images to match new directional lighting conditions, preserving visual continuity and immersion in AR settings.
Approach Synopsis
The authors propose an innovative architecture that combines physically-based principles with modern deep learning techniques. Their model performs face relighting in two primary stages:
- Diffuse Rendering Stage: The input face image is decomposed using a physics-based model to extract intrinsic properties such as albedo and normal maps. These intrinsic components are utilized to perform a diffuse rendering of the face under the target lighting conditions.
- Non-Diffuse Residual Stage: Recognizing the shortcomings of the purely diffuse model, the authors introduce a second stage to account for more complex lighting phenomena such as specular highlights and shadowing. This stage predicts a residual to correct the diffuse output, thereby incorporating important non-diffuse effects which are vital for the realistic presentation of human faces.
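The two stages above can be sketched as follows. This is a minimal NumPy illustration of the pipeline's structure, not the paper's implementation: the intrinsic decomposition and the residual network are learned in the paper, while here `residual_net` is just a placeholder callable and the diffuse pass is plain Lambertian shading.

```python
import numpy as np

def diffuse_render(albedo, normals, light_dir):
    """Lambertian diffuse rendering: I = albedo * max(0, n . l).

    albedo:    (H, W, 3) per-pixel reflectance
    normals:   (H, W, 3) unit surface normals
    light_dir: (3,) unit vector toward the directional light
    """
    shading = np.clip(normals @ light_dir, 0.0, None)   # (H, W)
    return albedo * shading[..., None]                  # (H, W, 3)

def relight(albedo, normals, target_light, residual_net):
    """Two-stage relighting: a physics-based diffuse pass plus a
    learned residual correction for specularities and cast shadows.

    residual_net stands in for the paper's second-stage network;
    here it is any callable mapping the diffuse image to a correction.
    """
    diffuse = diffuse_render(albedo, normals, target_light)
    return np.clip(diffuse + residual_net(diffuse), 0.0, 1.0)
```

The key design choice mirrored here is that the network only has to predict a correction on top of a physically plausible base image, rather than synthesizing the relit face from scratch.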
Data and Training
The authors introduce a new dataset of 21 subjects with diverse expressions and poses, captured in a controlled light stage with 32 individual light sources. This dataset is central to training and evaluating the proposed model. Ground-truth albedo and shading supervision are obtained via standard photometric stereo.
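Classic Lambertian photometric stereo, as referenced for the ground-truth generation, can be sketched as a per-pixel least-squares solve. This is a generic textbook formulation under assumed calibrated lights, not the paper's exact processing pipeline:

```python
import numpy as np

def photometric_stereo(images, light_dirs):
    """Recover per-pixel albedo and normals from images taken under
    known directional lights, assuming a Lambertian surface so that
    I_k = albedo * (n . l_k) for each light k.

    images:     (K, H, W) grayscale intensities under K lights
    light_dirs: (K, 3) unit light directions, K >= 3
    """
    K, H, W = images.shape
    I = images.reshape(K, -1)                            # (K, H*W)
    # Solve light_dirs @ G = I in the least-squares sense,
    # where G = albedo * n stacked for every pixel.
    G, *_ = np.linalg.lstsq(light_dirs, I, rcond=None)   # (3, H*W)
    albedo = np.linalg.norm(G, axis=0)                   # (H*W,)
    normals = G / np.maximum(albedo, 1e-8)               # (3, H*W)
    return albedo.reshape(H, W), normals.T.reshape(H, W, 3)
```

With 32 calibrated lights the system is heavily overdetermined, which makes the least-squares estimate robust to pixels where a few lights are shadowed or specular.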
Experimental Evaluation
The results demonstrate that the physics-guided approach outperforms learning-based methods that lack physical constraints. Notably, the proposed method reproduces non-diffuse lighting effects such as sharp cast shadows and specular highlights, which are essential for realistic face rendering, particularly in 3D applications and AR experiences. Quantitatively, the paper reports L1, L2, LPIPS, and DSSIM metrics, showing superior performance over baselines including direct neural network frameworks and traditional relighting techniques.
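For concreteness, the simpler of these metrics can be computed as below. This is a hedged sketch: L1 and L2 are standard pixel errors, DSSIM is shown with a single global window for brevity (standard implementations use local Gaussian windows), and LPIPS is omitted since it requires a pretrained perceptual network.

```python
import numpy as np

def l1_error(pred, gt):
    """Mean absolute pixel error."""
    return np.mean(np.abs(pred - gt))

def l2_error(pred, gt):
    """Root-mean-square pixel error."""
    return np.sqrt(np.mean((pred - gt) ** 2))

def dssim(pred, gt, data_range=1.0):
    """DSSIM = (1 - SSIM) / 2, computed over one global window."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_p, mu_g = pred.mean(), gt.mean()
    var_p, var_g = pred.var(), gt.var()
    cov = ((pred - mu_p) * (gt - mu_g)).mean()
    ssim = ((2 * mu_p * mu_g + c1) * (2 * cov + c2)) / \
           ((mu_p ** 2 + mu_g ** 2 + c1) * (var_p + var_g + c2))
    return (1.0 - ssim) / 2.0
```

Lower is better for all three; identical images score zero on each.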
Theoretical and Practical Implications
The paper bridges the gap between parametric modeling and deep learning by embedding physics-based image rendering within a neural network architecture. This hybrid approach improves robustness when training from limited data and generalizes better to unseen lighting conditions. It also exposes interpretable intermediate representations (albedo, normals, shading) that can be manipulated for other visual tasks, broadening the model's applicability in practical AR scenarios.
Future Directions
The work opens several avenues for future research. One such direction is the extension of this framework to handle even more complex lighting environments, beyond directional lighting. Another potential expansion is to apply the model to other scenarios, such as dynamic scenes and variable camera settings. Moreover, refining the model's handling of ambient and indirect illumination could be a valuable area of exploration. Integration with real-time AR systems could further validate the model’s effectiveness in practical applications.
In summary, this research presents a substantial step forward in face relighting, providing both a theoretical enhancement to the understanding of light-material interactions in neural networks and a practical tool for advancing realism in AR applications.