Towards Open-Set Identity Preserving Face Synthesis (1803.11182v2)

Published 29 Mar 2018 in cs.CV

Abstract: We propose a framework based on Generative Adversarial Networks to disentangle the identity and attributes of faces, such that we can conveniently recombine different identities and attributes for identity preserving face synthesis in open domains. Previous identity preserving face synthesis processes are largely confined to synthesizing faces with known identities that are already in the training dataset. To synthesize a face with identity outside the training dataset, our framework requires one input image of that subject to produce an identity vector, and any other input face image to extract an attribute vector capturing, e.g., pose, emotion, illumination, and even the background. We then recombine the identity vector and the attribute vector to synthesize a new face of the subject with the extracted attribute. Our proposed framework does not need to annotate the attributes of faces in any way. It is trained with an asymmetric loss function to better preserve the identity and stabilize the training process. It can also effectively leverage large amounts of unlabeled training face images to further improve the fidelity of the synthesized faces for subjects that are not presented in the labeled training face dataset. Our experiments demonstrate the efficacy of the proposed framework. We also present its usage in a much broader set of applications including face frontalization, face attribute morphing, and face adversarial example detection.

Authors (5)

Jianmin Bao (65 papers)
Dong Chen (220 papers)
Fang Wen (42 papers)
Houqiang Li (236 papers)
Gang Hua (101 papers)

Citations (242)

Summary

The paper introduces a GAN-based framework that disentangles facial identity and attributes to synthesize realistic faces in an open-set environment.
It is trained with an asymmetric loss function and can leverage unlabeled face images, which stabilizes training and preserves identity features without any attribute annotation.
Experimental results indicate high fidelity when generating faces of unseen identities, enabling applications such as face frontalization, attribute morphing, and adversarial example detection in face verification systems.

Towards Open-Set Identity Preserving Face Synthesis

The paper by Bao et al. introduces a framework based on Generative Adversarial Networks (GANs) for identity-preserving face synthesis in an open-set setting. Earlier identity-preserving methods are largely confined to identities already present in the training dataset; this framework can synthesize a face for a subject outside the training set from a single input image of that subject.

Framework and Methodology

The paper proposes a GAN-based system that disentangles facial identity from facial attributes, so that the two can be recombined to generate a face image with a specific identity and desired attributes (such as pose, emotion, illumination, and even the background). The identity vector is extracted from one input image of the subject, and the attribute vector from any other face image. Because the attribute representation is learned rather than labeled, the framework needs no attribute annotation, whereas many prior approaches depend on exhaustive manual labeling.

The architecture of the proposed system is composed of five key components: an identity encoder, an attribute encoder, a generator network, a classification network, and a discriminator network. Each network plays a distinct role in the end-to-end training process. The identity and attribute encoders extract identity-related and attribute-related information from faces respectively. The generator uses these encoded vectors to synthesize realistic face images. The classification network ensures identity preservation in the synthesized images, while the discriminator differentiates between real and generated images.
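The data flow between the two encoders and the generator can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: toy linear maps stand in for deep networks, images are flat lists of floats, and the names `make_linear`, `synthesize`, and the dimension constants are hypothetical. The classification and discriminator networks only supply training losses, so they are omitted here.

```python
import random

# Hypothetical toy sizes; the real networks operate on images.
IMAGE_DIM = 12
ID_DIM = 4
ATTR_DIM = 3


def make_linear(in_dim, out_dim, seed):
    """Return a fixed random linear map, standing in for a trained network."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def apply(vec):
        if len(vec) != in_dim:
            raise ValueError(f"expected {in_dim} values, got {len(vec)}")
        return [sum(w * v for w, v in zip(row, vec)) for row in weights]

    return apply


# Stand-ins for the identity encoder, attribute encoder, and generator.
identity_encoder = make_linear(IMAGE_DIM, ID_DIM, seed=0)
attribute_encoder = make_linear(IMAGE_DIM, ATTR_DIM, seed=1)
generator = make_linear(ID_DIM + ATTR_DIM, IMAGE_DIM, seed=2)


def synthesize(subject_image, attribute_image):
    """Combine the identity of one image with the attributes of another."""
    identity_vec = identity_encoder(subject_image)
    attribute_vec = attribute_encoder(attribute_image)
    # The generator consumes the concatenated identity and attribute vectors.
    return generator(identity_vec + attribute_vec)


if __name__ == "__main__":
    subject = [0.1 * i for i in range(IMAGE_DIM)]
    reference = [1.0 - 0.05 * i for i in range(IMAGE_DIM)]
    print(len(synthesize(subject, reference)))
```

The point of the sketch is the wiring: the subject image contributes only an identity vector, the second image contributes only an attribute vector, and the generator sees nothing else.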

Asymmetric Loss Function and Training Stability

A significant innovation in the training of this GAN-based system is the adoption of an asymmetric loss configuration. In a traditional GAN, the generator is trained against the same adversarial objective that the discriminator optimizes. Here the objectives differ: the discriminator and the classification network are trained with cross-entropy losses, while the generator is trained with a pairwise feature matching loss. The authors report that this stabilizes GAN training and better preserves identity features during synthesis.
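The asymmetry can be illustrated with the three loss terms side by side. This is a sketch that assumes the feature matching term is half the squared Euclidean distance between intermediate features of a real and a synthesized image; the function names, and the choice of which layer supplies the features, are assumptions for illustration.

```python
import math


def discriminator_loss(p_real_on_real, p_real_on_fake, eps=1e-12):
    """Binary cross-entropy: real images should score 1, synthesized ones 0."""
    return -(math.log(p_real_on_real + eps)
             + math.log(1.0 - p_real_on_fake + eps))


def classifier_loss(logits, identity_label):
    """Softmax cross-entropy over identity classes (numerically stable)."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[identity_label]


def feature_matching_loss(features_fake, features_real):
    """Generator term: half the squared distance between feature vectors."""
    if len(features_fake) != len(features_real):
        raise ValueError("feature vectors must have the same length")
    return 0.5 * sum((f - r) ** 2
                     for f, r in zip(features_fake, features_real))


if __name__ == "__main__":
    # The generator never sees the cross-entropy terms directly; it only
    # tries to match features extracted by the other two networks.
    print(feature_matching_loss([1.0, 0.0], [0.0, 0.0]))
```

Under this reading, the discriminator and classifier still solve ordinary classification problems, and only the generator's objective changes.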

Unsupervised Learning and Open-Set Synthesis

The paper also shows that the framework can leverage large amounts of unlabeled face images during training. According to the authors, this improves the fidelity of the synthesized faces, especially for subjects that do not appear in the labeled training dataset. This is what makes open-set synthesis practical: the model can generalize beyond the identities in the fixed labeled set.

Experimental Evaluation and Applications

Experimental results are presented to demonstrate the efficacy of the proposed framework. Quantitative evaluations use face identification tasks on both seen and unseen identities and indicate that the synthesized images retain high fidelity in both cases. Qualitative experiments further illustrate face frontalization and attribute morphing, that is, gradually transforming the attributes of a face while maintaining its identity.
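Attribute morphing of this kind can be sketched as a walk between two attribute vectors while the identity vector stays fixed. The sketch assumes plain linear interpolation in the attribute space; the function name `interpolate_attributes` is hypothetical, and each intermediate vector would be passed to the generator together with the unchanged identity vector.

```python
def interpolate_attributes(attr_start, attr_end, steps):
    """Return `steps` attribute vectors evenly spaced from start to end."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    if len(attr_start) != len(attr_end):
        raise ValueError("attribute vectors must have the same length")
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # 0.0 at the start, 1.0 at the end
        path.append([(1.0 - t) * a + t * b
                     for a, b in zip(attr_start, attr_end)])
    return path


if __name__ == "__main__":
    for vec in interpolate_attributes([0.0, 2.0], [2.0, 0.0], steps=3):
        print(vec)
```

Because only the attribute vector changes along the path, any drift in identity across the sequence would indicate that the two representations are not fully disentangled.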

The paper also discusses using the framework to detect adversarial examples in face verification systems. The idea is to compare how well identity is preserved between an input image and its GAN-synthesized counterpart: a mismatch suggests that the input carries subtle adversarial manipulations, which adds a layer of security to biometric verification systems.
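One way to picture the comparison step is a similarity check between identity features of the input and of its synthesized counterpart. This is a hedged sketch, not the paper's exact procedure: the use of cosine similarity, the function names, and the threshold value of 0.5 are all assumptions chosen for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("feature vectors must be non-zero")
    return dot / (norm_u * norm_v)


def looks_adversarial(input_features, synthesized_features, threshold=0.5):
    """Flag the input when identity is poorly preserved after synthesis.

    `threshold` is a hypothetical value; in practice it would be tuned
    on held-out clean and adversarial examples.
    """
    return cosine_similarity(input_features, synthesized_features) < threshold


if __name__ == "__main__":
    print(looks_adversarial([1.0, 0.0], [0.0, 1.0]))
```

The intuition is that a clean face survives the round trip through the identity encoder and generator, whereas a perturbation crafted against a verification network may not.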

Theoretical and Practical Implications

Theoretically, this research provides a strategy for disentangling identity and attributes without any attribute annotation, bypassing the limitations imposed by exhaustive attribute labeling. Practically, the framework broadens the range of face synthesis applications, from privacy-preserving data augmentation in machine learning pipelines to proactive defense against adversarial perturbations in security systems.

Future Directions

While the framework offers significant improvements over existing solutions, further exploration could involve extending these techniques to other complex datasets (beyond facial imagery) and exploring potential improvements in synthesis quality and processing efficiency. Additionally, refining the integration of unlabeled datasets to enhance open-set generalization capabilities remains a promising avenue for future research.
