- The paper introduces TP-GAN, a two-pathway GAN that synthesizes photorealistic frontal face images from profile views while preserving identity.
- It employs a dual architecture with a global pathway for overall structure and local patch networks focused on key facial landmarks.
- The model achieves superior performance on the Multi-PIE benchmark by integrating adversarial, symmetry, identity-preserving, and pixel-wise losses for robust synthesis.
Global and Local Perception GAN for Frontal View Synthesis
The paper "Beyond Face Rotation: Global and Local Perception GAN for Photorealistic and Identity Preserving Frontal View Synthesis" presents an approach to synthesizing frontal views from a single face image using a Two-Pathway Generative Adversarial Network (TP-GAN). The method addresses the challenge of generating photorealistic, identity-preserving frontal faces from profile perspectives, a problem that is ill-posed because large self-occlusions leave many plausible frontal completions consistent with a single profile view.
Methodology Overview
TP-GAN consists of a dual-path architecture: a global pathway for the holistic face structure and local pathways focused on detailed textures around specific facial landmarks. This structure reflects human perceptual processes, effectively integrating broad structural understanding with intricate local details.
- Global Pathway: An encoder-decoder network that processes the overall facial structure; its bottleneck representation is additionally supervised for identity classification during training.
- Local Pathways: Four patch networks centered on key facial landmarks (left eye, right eye, nose, mouth) that reconstruct fine local texture.
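As a concrete illustration, the local pathways operate on landmark-centered crops of the input face. A minimal NumPy sketch of the cropping step; the landmark coordinates, image size, and patch size below are hypothetical placeholders, not values from the paper:

```python
import numpy as np

def crop_patch(image, center, size):
    """Crop a square patch of side `size` centered on a (y, x) landmark."""
    half = size // 2
    y, x = center
    return image[y - half:y + half, x - half:x + half]

# Hypothetical landmark coordinates on a 128x128 aligned face image;
# a real pipeline would obtain these from a landmark detector.
landmarks = {"left_eye": (48, 40), "right_eye": (48, 88),
             "nose": (72, 64), "mouth": (96, 64)}
face = np.zeros((128, 128, 3), dtype=np.float32)
patches = {name: crop_patch(face, c, 32) for name, c in landmarks.items()}
```

Each of the four crops is then fed to its own small encoder-decoder network, which specializes in that region's texture.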
The combined pathways feed into a GAN framework, leveraging adversarial loss to match the synthesized image distribution with real frontal faces, while symmetry and identity-preserving losses ensure structural coherence and retention of identity features.
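Concretely, the local outputs can be stitched back onto a full-face canvas before being combined with the global pathway. A minimal NumPy sketch; the element-wise max over overlaps follows the spirit of the paper's fusion of local outputs, while the helper name, shapes, and demo values are illustrative assumptions:

```python
import numpy as np

def fuse_local_patches(patch_outputs, landmarks, canvas_hw=(128, 128)):
    """Render each synthesized patch back at its landmark location on a
    blank canvas; overlapping regions are combined with an element-wise
    max so that stronger local responses win."""
    canvas = np.zeros(canvas_hw + (3,), dtype=np.float32)
    for name, patch in patch_outputs.items():
        y, x = landmarks[name]
        h, w = patch.shape[:2]
        top, left = y - h // 2, x - w // 2
        canvas[top:top + h, left:left + w] = np.maximum(
            canvas[top:top + h, left:left + w], patch)
    return canvas

# Minimal demo with a single hypothetical nose patch.
fused = fuse_local_patches({"nose": np.ones((32, 32, 3), np.float32)},
                           {"nose": (72, 64)})
```

In the full model, this fused canvas is combined channel-wise with the global decoder's output before the final layers produce the synthesized frontal face, so the discriminator and losses always see a single, complete image.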
Loss Functions
The composite loss function employed in TP-GAN integrates:
- Adversarial Loss: Drives photorealism by aligning synthesized images with the distribution of actual frontal faces.
- Symmetry Loss: Encourages symmetrical completion of occluded regions, crucial for large pose variations.
- Identity Preserving Loss: Enforces agreement in the feature space of a pretrained face recognition network, securing identity consistency across pose variations.
- Pixel-wise Loss: Ensures pixel-level synthesis fidelity across multi-scale outputs.
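The four terms combine into a single weighted objective for the generator. A minimal NumPy sketch of that combination; the loss weights below are illustrative placeholders, not the paper's values, and the identity features stand in for embeddings from a pretrained recognition network:

```python
import numpy as np

def l1(a, b):
    """Mean absolute error between two arrays."""
    return np.abs(a - b).mean()

def tp_gan_loss(pred, target, id_pred, id_target, d_score,
                w_sym=0.3, w_adv=1e-3, w_ip=3e-3):
    """Weighted sum of the four losses; weights are illustrative."""
    pixel = l1(pred, target)              # pixel-wise fidelity
    sym = l1(pred, pred[:, ::-1])         # horizontal-flip symmetry
    adv = -np.log(d_score + 1e-8).mean()  # generator's adversarial term
    ip = l1(id_pred, id_target)           # identity-feature agreement
    return pixel + w_sym * sym + w_adv * adv + w_ip * ip

# Sanity check: a perfect, symmetric prediction that fully convinces the
# discriminator incurs (near-)zero total loss.
zero = np.zeros((8, 8, 3), dtype=np.float32)
loss_demo = tp_gan_loss(zero, zero, np.zeros(256), np.zeros(256),
                        np.array([1.0]))
```

Note how the symmetry term needs no ground truth at all: it compares the prediction with its own horizontal mirror, which is what lets it fill in regions occluded in the profile view.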
Results and Implications
Experimental evaluation on the Multi-PIE dataset shows that TP-GAN outperforms state-of-the-art techniques in generating clear, identity-preserving frontal faces from large pose angles. The model preserves key identity attributes even under extreme pose transitions and remains robust to variation in pose and illumination.
The research positions TP-GAN as a practical tool for enhancing face analysis applications, including recognition and attribute estimation, without relying on intermediate features. The findings underscore the potential of GANs in addressing ill-posed vision problems through model-driven yet data-informed strategies.
Future Directions
Potential avenues for future work include further refining local detail synthesis, scaling to in-the-wild datasets, and adapting the architecture for real-time applications. The "recognition via generation" paradigm demonstrated by TP-GAN could reshape facial analysis in constrained data environments, allowing face synthesis and recognition to interoperate directly.
This work’s contribution lies in its balance of architectural novelty and effective constraint integration, promising advancements in both theoretical understanding and application of GANs in complex image synthesis tasks.