- The paper GazeDirector introduces a model-fitting approach using analysis-by-synthesis for fully articulated eye gaze redirection in video without requiring person-specific data.
- GazeDirector constructs a multi-part eye model and uses iterative optimization to track and alter gaze, recovering eye shape, texture, pose, and gaze details for accurate redirection, even for large angles.
- Evaluation shows GazeDirector outperforms warping methods in minimizing artifacts and has significant implications for VR, assistive technologies, and post-production by enabling precise gaze control.
Overview of GazeDirector: Fully Articulated Eye Gaze Redirection in Video
The paper "GazeDirector: Fully Articulated Eye Gaze Redirection in Video" introduces an innovative approach to eye gaze manipulation, proposing a new methodology called GazeDirector. This approach distinguishes itself from previous techniques by employing a comprehensive model-fitting process utilizing analysis-by-synthesis to adjust eye gaze direction in video sequences without requiring person-specific training data. This capability is particularly important as it enables redirection of gaze to precisely specified new directions in three-dimensional space, a significant advancement over earlier methods limited to angular offsets.
Technical Contributions
The primary contribution of GazeDirector is a multi-part eye region model that combines facial and eyeball components to track and alter gaze. Unlike related techniques such as DeepWarp, which directly predicts image-warping flow fields, GazeDirector recovers eye shape, texture, pose, and gaze through iterative model fitting. This enables accurate redirection even for large angular displacements, reducing the artifacts typically associated with warping methods.
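The analysis-by-synthesis principle behind this fitting can be illustrated with a deliberately simplified sketch: a toy "renderer" generates an image from a handful of parameters, and a photometric energy is minimized until the synthesized image matches the observation. The Gaussian-blob renderer below merely stands in for GazeDirector's multi-part eye region model; all names and parameters are illustrative:

```python
import numpy as np
from scipy.optimize import minimize

def render(theta, size=32):
    """Toy renderer: a Gaussian blob with centre (cx, cy) and brightness b.
    Stands in for the eye-region model's image synthesis."""
    cx, cy, b = theta
    ys, xs = np.mgrid[0:size, 0:size]
    return b * np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / 20.0)

observed = render([18.0, 12.0, 1.0])   # stands in for an input video frame

# Photometric reconstruction energy: squared difference between the
# synthesized image and the observation.
energy = lambda theta: np.sum((render(theta) - observed) ** 2)

# Fit by numerical optimization from a rough initialization.
fit = minimize(energy, x0=[14.0, 11.0, 0.7], method="Nelder-Mead")
print(fit.x)   # approximately [18, 12, 1]
```

The real system optimizes far richer parameters (shape, texture, pose, illumination), but the loop is the same: synthesize, compare, update.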
Specifically, GazeDirector consists of two stages: eye region tracking and gaze redirection. Tracking fits a 3D model to the video frames by minimizing a reconstruction energy over shape, texture, pose, and illumination parameters. Through a combination of procedural animation and numerical optimization, the model adapts to the nuances of the eye region, yielding robust tracking. For redirection, the fitted model is re-posed to look at the new gaze target; eyelid motion is simulated with a model-derived optical flow field, and the result is composited seamlessly onto the output frame.
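The warp-and-composite step can likewise be sketched in a few lines. Below, a dense flow field (standing in for the model-derived eyelid motion) backward-warps the frame, and a soft mask blends the re-synthesized region in. This is an illustrative reconstruction under simplified assumptions (grayscale images, a given flow and mask), not the paper's implementation:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp_by_flow(image, flow):
    """Backward-warp a grayscale image by a dense flow field.
    flow[..., 0] / flow[..., 1] hold per-pixel (dy, dx) displacements,
    e.g. eyelid motion between the old and new gaze."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    coords = np.stack([ys - flow[..., 0], xs - flow[..., 1]])
    return map_coordinates(image, coords, order=1, mode="nearest")

def composite(frame, synthesized, alpha):
    """Blend a re-posed synthetic region into the frame with a soft
    mask so the seam is invisible."""
    return alpha * synthesized + (1.0 - alpha) * frame

# Toy example: shift a frame down by 2 pixels, then blend in a bright patch
frame = np.random.rand(64, 64)
flow = np.zeros((64, 64, 2)); flow[..., 0] = 2.0
warped = warp_by_flow(frame, flow)
mask = np.zeros((64, 64)); mask[20:40, 20:40] = 1.0
out = composite(warped, np.ones((64, 64)), mask)
```

Warping only the thin, deforming eyelid region while fully re-rendering the eyeball is what lets the method sidestep the large-displacement failure modes of pure warping.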
Evaluation and Results
The authors conduct quantitative and qualitative evaluations confirming the efficacy of GazeDirector. Quantitatively, the method is evaluated on the Columbia gaze dataset, achieving competitive gaze estimation performance without dataset-specific training and surpassing some previous approaches in gaze yaw accuracy. In gaze redirection experiments, GazeDirector minimizes image differences against ground truth more effectively than warping-based approaches, avoiding the smudging artifacts those methods tend to produce.
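For reference, gaze estimation accuracy on datasets like Columbia is conventionally reported as the angular error between estimated and ground-truth gaze vectors, which can be computed as follows (a standard metric in the field, not code from the paper):

```python
import numpy as np

def angular_error_deg(g_est, g_true):
    """Angular difference in degrees between estimated and ground-truth
    3D gaze vectors."""
    g_est = g_est / np.linalg.norm(g_est, axis=-1, keepdims=True)
    g_true = g_true / np.linalg.norm(g_true, axis=-1, keepdims=True)
    cos = np.clip(np.sum(g_est * g_true, axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos))

# A gaze estimate off by 5 degrees of yaw
est = np.array([np.sin(np.radians(5)), 0.0, -np.cos(np.radians(5))])
true = np.array([0.0, 0.0, -1.0])
print(angular_error_deg(est, true))   # -> ~5.0
```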
Qualitative assessments include comparisons with recent methods and application examples such as gaze manipulation in YouTube videos, illustrating practical use cases like convincingly altering a person's gaze direction across video frames. These outcomes highlight GazeDirector's strength in handling both gaze estimation and redirection under diverse conditions.
Implications and Future Directions
The implications of GazeDirector are significant for numerous applications in computer vision and multimedia processing. The ability to control eye gaze after capture can enhance virtual reality experiences, assistive technologies, and media post-production. For instance, an actor's gaze can be adjusted in post-production to match the placement of a computer-generated character, a task that is conventionally difficult because the character is not present on set.
Theoretically, GazeDirector advances model-based tracking methodologies, deepening the understanding of face and eye articulation in video through robust 3D modeling. Future work could integrate facial expressions and environmental factors such as eyeglasses for more general applicability, further expanding the ability of vision systems to understand and manipulate human gaze behavior in dynamic environments.
In conclusion, GazeDirector marks a considerable step forward in eye gaze manipulation, opening possibilities for both research and practical adoption in fields that require precise gaze adjustments without extensive user-specific calibration.