
Improved Consistency Regularization for GANs (2002.04724v2)

Published 11 Feb 2020 in stat.ML and cs.LG

Abstract: Recent work has increased the performance of Generative Adversarial Networks (GANs) by enforcing a consistency cost on the discriminator. We improve on this technique in several ways. We first show that consistency regularization can introduce artifacts into the GAN samples and explain how to fix this issue. We then propose several modifications to the consistency regularization procedure designed to improve its performance. We carry out extensive experiments quantifying the benefit of our improvements. For unconditional image synthesis on CIFAR-10 and CelebA, our modifications yield the best known FID scores on various GAN architectures. For conditional image synthesis on CIFAR-10, we improve the state-of-the-art FID score from 11.48 to 9.21. Finally, on ImageNet-2012, we apply our technique to the original BigGAN model and improve the FID from 6.66 to 5.38, which is the best score at that model size.

Improved Consistency Regularization for GANs

Generative Adversarial Networks (GANs) have become a prominent class of deep generative models; however, they are notoriously difficult to train. Various strategies have been investigated to overcome these challenges, among which consistency regularization techniques have gained traction. In this paper, the authors propose a novel approach termed Improved Consistency Regularization (ICR) that enhances GAN performance by modifying existing consistency regularization methodologies.

Methodological Innovations

The paper presents two primary enhancements: Balanced Consistency Regularization (bCR) and Latent Consistency Regularization (zCR). In the traditional CR-GAN approach, consistency regularization is applied only between real images and their augmentations, leading to potential generation artifacts when only real images are augmented. The authors counter this issue by introducing bCR, where both real images and generated samples are subjected to consistency regularization, reducing augmentation-induced artifacts.
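The bCR penalty can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the toy discriminator, the noise-based stand-in for image augmentations, and the default weights are assumptions for demonstration only. The essential point is that the augmentation-consistency term is applied symmetrically to real and generated images.

```python
import numpy as np

rng = np.random.default_rng(0)

def discriminator(x):
    # Toy stand-in for D: a scalar score of the input (assumption).
    return float(np.tanh(x.sum()))

def augment(x, rng):
    # Stand-in for image augmentations such as flips/shifts (assumption).
    return x + 0.01 * rng.standard_normal(x.shape)

def bcr_penalty(d, x_real, x_fake, rng, lam_real=10.0, lam_fake=10.0):
    """Balanced consistency regularization: penalize D's sensitivity to
    augmentations of BOTH real and generated images, so the augmentation
    signal cannot leak into the generated distribution as artifacts."""
    l_real = (d(x_real) - d(augment(x_real, rng))) ** 2
    l_fake = (d(x_fake) - d(augment(x_fake, rng))) ** 2
    return lam_real * l_real + lam_fake * l_fake

x_real = rng.standard_normal((8, 8))  # a "real" image batch element
x_fake = rng.standard_normal((8, 8))  # a "generated" image batch element
penalty = bcr_penalty(discriminator, x_real, x_fake, rng)
```

In training, `penalty` would simply be added to the usual discriminator loss; the generator's objective is unchanged by bCR.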

Additionally, the paper introduces zCR, which shifts consistency regularization to the latent space. Small perturbations of the latent vector should leave the discriminator's output unchanged, while the generator is encouraged to remain sensitive to them so that nearby latent codes still produce visibly different images. Combining bCR and zCR yields the full ICR method, which achieves superior performance metrics.
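The two zCR terms can be sketched in the same toy setting. Again a hedged NumPy illustration: the linear "generator", scalar "discriminator", perturbation scale, and weights are assumptions chosen for clarity, not the paper's networks or hyper-parameters. The signs are the important part: the discriminator term is a positive invariance penalty, while the generator term is negative, rewarding output diversity under latent perturbations.

```python
import numpy as np

rng = np.random.default_rng(0)

W = rng.standard_normal((16, 4))  # weights of a toy linear "generator" (assumption)

def generator(z):
    return W @ z

def discriminator(x):
    return float(np.tanh(x.mean()))

def zcr_terms(g, d, z, rng, sigma=0.1, lam_dis=5.0, lam_gen=0.5):
    """Latent consistency regularization: D should be invariant to a small
    latent perturbation, while G is pushed to stay sensitive to it."""
    z_pert = z + sigma * rng.standard_normal(z.shape)
    x, x_pert = g(z), g(z_pert)
    # Added to the discriminator loss: invariance penalty (>= 0).
    l_dis = lam_dis * (d(x) - d(x_pert)) ** 2
    # Added to the generator loss: negative, so minimizing it *increases*
    # the distance between outputs of nearby latents.
    l_gen = -lam_gen * float(np.mean((x - x_pert) ** 2))
    return l_dis, l_gen

z = rng.standard_normal(4)
l_dis, l_gen = zcr_terms(generator, discriminator, z, rng)
```

Each term is added to the corresponding player's loss; with `lam_gen = 0` the generator term vanishes and only the discriminator invariance penalty remains.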

Empirical Evaluation

The proposed ICR method demonstrates marked improvements in Fréchet Inception Distance (FID) across benchmarks. For unconditional image synthesis on CIFAR-10 and CelebA, the authors report the best known FID scores across several GAN architectures.

For conditional image synthesis on CIFAR-10, ICR improves the state-of-the-art FID from 11.48 to 9.21. Applied to the original BigGAN model on ImageNet-2012, it improves the FID from 6.66 to 5.38, the best score at that model size, showcasing substantial gains in image fidelity.

Theoretical and Practical Implications

From a theoretical standpoint, the integration of bCR and zCR provides a more holistic form of consistency regularization, resolving prior issues with CR techniques and offering a more stable training paradigm for GANs. These improvements not only reduce artifacts but also enhance the diversity and quality of generated images.

Practically, ICR is a straightforward addition to existing GAN frameworks that is computationally cheap and comparatively insensitive to hyper-parameter tuning, making it attractive for broad applications of GANs such as image synthesis and data augmentation.

Future Directions

This paper opens several avenues for future research and development. The successful implementation of ICR could inspire further enhancements to GAN training methodologies, such as exploring more sophisticated augmentation strategies in both input and latent spaces. Additionally, understanding the implications of ICR on other types of generative networks and extending its principles to multimodal generative tasks could provide substantial advancements in generative modeling.

In conclusion, the work encapsulates a significant stride forward in the domain of GAN training by addressing key deficiencies in existing regularization strategies, ultimately pushing the frontier of what is achievable with generative adversarial networks.

Authors (6)
  1. Zhengli Zhao
  2. Sameer Singh
  3. Honglak Lee
  4. Zizhao Zhang
  5. Augustus Odena
  6. Han Zhang
Citations (140)