- The paper introduces a triple consistency loss that aligns image distributions for consistent progressive face synthesis.
- It integrates the loss into a landmark-guided framework (GANnotation) to modify facial expression and pose while preserving identity.
- Comparative evaluations against StarGAN show fewer artifacts and enhanced realism in progressive image translation tasks.
An Academic Overview of Triple Consistency Loss for GAN-Based Face Synthesis
This research paper advances the understanding and application of Generative Adversarial Networks (GANs) in face synthesis through a novel loss function. Enrique Sanchez and Michel Valstar propose a "triple consistency loss" that addresses a key limitation of prior GAN-based face-to-face translation methods. These methods typically rely on a self-consistency (cycle) loss: when an output image is fed back into the network with the inverted target attributes, the network should regenerate the original input. While this approach has produced photo-realistic results, the authors show that it does not bring the input and target distributions into close alignment, so generated images degrade when reused as inputs for further translations. The paper provides empirical evidence of this mismatch and presents a method to resolve it.
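The self-consistency (cycle) scheme described above can be sketched numerically. This is a minimal illustration, not the paper's architecture: `toy_generator` and the random attribute maps are hypothetical stand-ins for a conditional generator and its conditioning inputs.

```python
import numpy as np

def toy_generator(image, target):
    # Hypothetical stand-in for a conditional generator G(x, t):
    # a simple blend toward the target map, just enough to make
    # the loss computable. A real model would be a deep network.
    return 0.9 * image + 0.1 * target

def self_consistency_loss(G, x, t_tgt, t_src):
    # Cycle/self-consistency: translate x to the target attributes
    # t_tgt, then back with the source attributes t_src; the result
    # should recover x (measured here with an L1 distance).
    x_cycled = G(G(x, t_tgt), t_src)
    return np.abs(x_cycled - x).mean()

rng = np.random.default_rng(0)
x = rng.random((64, 64))       # toy "input image"
t_tgt = rng.random((64, 64))   # toy target attribute/landmark map
t_src = rng.random((64, 64))   # toy source attribute/landmark map

loss = self_consistency_loss(toy_generator, x, t_tgt, t_src)
```

Note that this loss only constrains the round trip back to `x`; it says nothing about whether the intermediate output itself lies in a distribution the generator can consume again, which is exactly the gap the paper identifies.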
Summary of Contributions
- Triple Consistency Loss: The core contribution is a triple consistency loss that aligns input and output distributions by enforcing that an image synthesized directly from the input matches the image obtained by first passing through an intermediate target. Under this constraint, generated images remain consistent across multiple forward passes, which is crucial for progressive image translation.
- Application to Face Synthesis: The authors incorporate the triple consistency loss into GANnotation, a framework for landmark-guided face-to-face synthesis. Unlike previous methods restricted to appearance changes, GANnotation can simultaneously transform both expression and pose. By mapping faces to target landmark configurations, the model can generate person-specific image sets with minimal supervision, facilitating applications in face alignment and dataset augmentation.
- Evaluation and Comparative Analysis: Through comparative studies, particularly with StarGAN, the paper highlights the efficacy of triple consistency loss in maintaining the quality of progressive translations. The retrained StarGAN with triple consistency loss surpasses the original in maintaining output realism, as evidenced by fewer artifacts and improved attribute rendition in progressive translation scenarios.
- Addressing Identity Preservation: While achieving progressive synthesis capabilities, the triple consistency framework still honors identity preservation, a crucial feature when conducting any face manipulation task. Furthermore, the architecture adapts known approaches such as mask-based generation for added realism.
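The triple consistency idea summarized in the bullets above can be illustrated with a short sketch: the loss compares a direct translation against a two-step translation routed through an intermediate target. As before, `toy_generator` is a hypothetical stand-in for the paper's network, used only to make the computation concrete.

```python
import numpy as np

def toy_generator(image, target):
    # Hypothetical conditional generator G(x, t): blends the input
    # toward the target map (illustrative only).
    return 0.9 * image + 0.1 * target

def triple_consistency_loss(G, x, t_mid, t_final):
    # Direct path:   x --t_final--> direct
    # Indirect path: x --t_mid--> intermediate --t_final--> indirect
    # Penalizing their disagreement (L1 here) keeps generated images
    # usable as inputs for further translations.
    direct = G(x, t_final)
    indirect = G(G(x, t_mid), t_final)
    return np.abs(direct - indirect).mean()

rng = np.random.default_rng(1)
x = rng.random((64, 64))         # toy "input image"
t_mid = rng.random((64, 64))     # toy intermediate target map
t_final = rng.random((64, 64))   # toy final target map

loss_triple = triple_consistency_loss(toy_generator, x, t_mid, t_final)
```

For the toy blend generator the two paths disagree (the indirect path retains a trace of `t_mid`), so the loss is strictly positive; a generator whose outputs truly match its input distribution would drive it toward zero.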
Theoretical Implications and Future Research Directions
The introduction of triple consistency loss represents a meaningful step toward addressing distribution mismatches in image-to-image translation. By effectively pairing distributions, this approach could alleviate long-standing issues in generating reusable images for subsequent transformations, which is vital in domains requiring continuous image editing or attribute modification.
Future work could extend the triple consistency concept to other domains beyond facial synthesis, potentially addressing similar distribution alignment challenges in broader GAN applications, such as art style transfer or medical imaging transformations. Furthermore, the interplay between triple consistency and self-consistency losses opens avenues for a potential hybridized methodology that adeptly balances identity preservation and distribution alignment without compromising on either.
In conclusion, this paper shows that by leveraging a novel loss function, it is possible to fundamentally improve the robustness and efficacy of GAN-based synthesis systems. The broad implications suggest a rethinking of loss paradigms to better support progressive and multi-step transformations across various machine learning tasks.