
Improving the Improved Training of Wasserstein GANs: A Consistency Term and Its Dual Effect (1803.01541v1)

Published 5 Mar 2018 in cs.CV, cs.LG, and stat.ML

Abstract: Despite being impactful on a variety of problems and applications, the generative adversarial nets (GANs) are remarkably difficult to train. This issue is formally analyzed by Arjovsky and Bottou (2017), who also propose an alternative direction to avoid the caveats in the minimax two-player training of GANs. The corresponding algorithm, called Wasserstein GAN (WGAN), hinges on the 1-Lipschitz continuity of the discriminator. In this paper, we propose a novel approach to enforcing the Lipschitz continuity in the training procedure of WGANs. Our approach seamlessly connects WGAN with one of the recent semi-supervised learning methods. As a result, it gives rise to not only better photo-realistic samples than the previous methods but also state-of-the-art semi-supervised learning results. In particular, our approach gives rise to the inception score of more than 5.0 with only 1,000 CIFAR-10 images and is the first that exceeds the accuracy of 90% on the CIFAR-10 dataset using only 4,000 labeled images, to the best of our knowledge.

Citations (252)

Summary

  • The paper introduces a new consistency term that enforces Lipschitz continuity using dropout-induced perturbations over the data manifold.
  • The method is validated by experiments on MNIST and CIFAR-10, significantly boosting inception scores and mitigating overfitting.
  • The approach enhances semi-supervised learning, achieving competitive test accuracy with limited labeled samples.

Improving the Improved Training of Wasserstein GANs: A Consistency Term and Its Dual Effect

The paper "Improving the Improved Training of Wasserstein GANs: A Consistency Term and Its Dual Effect" presents a refined methodology to enhance the training stability and performance of Wasserstein GANs (WGANs), which are recognized for their capacity to produce high-quality samples in generative models. The authors identify and address the intrinsic complexities of training GANs, specifically those associated with enforcing the 1-Lipschitz continuity, which is critical for the effective performance of WGANs.

Context and Approach

Wasserstein GANs were introduced as a notable development in generative modeling, offering theoretical improvements over standard GANs by using the Wasserstein distance as the training objective. Although WGANs mitigate some of the mode collapse issues of traditional GANs, they present their own challenges, primarily around maintaining the 1-Lipschitz condition on the discriminator, which is crucial for stable learning. Traditional approaches to enforcing this condition, such as weight clipping, are often inadequate: they can constrain model capacity and lead to vanishing or exploding gradients. The improved training method based on a gradient penalty (GP) addresses this by penalizing deviations of the discriminator's gradient norm from 1, but the penalty is applied only at points sampled along lines between real and generated samples, so the Lipschitz constraint is not enforced uniformly over the support of the real data distribution.
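For concreteness, the following is a minimal PyTorch-style sketch of the gradient-penalty baseline that the paper builds on. It is not code from the paper; the `critic` interface, the assumption of 4-D image batches, and the default weight are illustrative.

```python
import torch

def gradient_penalty(critic, real, fake, lambda_gp=10.0):
    """WGAN-GP penalty (sketch): push the critic's gradient norm toward 1 at
    points interpolated between real and generated samples. Assumes `real`
    and `fake` are image batches of shape (B, C, H, W)."""
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (eps * real + (1.0 - eps) * fake).requires_grad_(True)
    scores = critic(interp)
    grads = torch.autograd.grad(outputs=scores.sum(), inputs=interp,
                                create_graph=True)[0]
    grad_norm = grads.view(grads.size(0), -1).norm(2, dim=1)
    return lambda_gp * ((grad_norm - 1.0) ** 2).mean()
```

Because the interpolated points lie on straight lines between real and fake samples, the penalty says little about how the critic behaves in the immediate neighborhood of the real data, which is the gap the consistency term targets.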

In addressing these limitations, the authors propose an alternative regularization strategy based on a new consistency term (CT) that aims to enforce the Lipschitz continuity more comprehensively across the real data distribution. Each real sample is effectively perturbed twice, and the discriminator is penalized whenever its responses to the two perturbed versions differ too much, which keeps its outputs consistent for inputs that lie close together on the data manifold. Rather than penalizing only interpolated points, the method concentrates on points around the real data manifold, using dropout applied to the discriminator's hidden layers to generate the perturbations.
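A minimal sketch of such a consistency term is shown below, assuming a critic whose forward pass applies dropout and can also return its second-to-last hidden layer. The structure (a hinge at a margin M' over the distance between two dropout-perturbed passes, with a smaller 0.1 weight on the second-to-last layer and an overall weight λ2) follows the paper's general form, but the interface and constants here are illustrative, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def consistency_term(critic, real, m_prime=0.0, lam2=2.0):
    """Consistency term (sketch): run the critic twice on the same real batch
    with dropout active, so each pass sees a slightly perturbed network, and
    penalize disagreement between the two passes.

    Assumes `critic(x, return_hidden=True)` returns (score, second-to-last
    features) and contains dropout layers that stay active in training mode;
    this interface and the default constants are assumptions for illustration.
    """
    score1, hidden1 = critic(real, return_hidden=True)
    score2, hidden2 = critic(real, return_hidden=True)

    # Distance between the two perturbed outputs, plus a smaller-weighted
    # distance between the second-to-last hidden activations, hinged at M'.
    d_score = (score1 - score2).pow(2).view(real.size(0), -1).mean(dim=1)
    d_hidden = (hidden1 - hidden2).pow(2).view(real.size(0), -1).mean(dim=1)
    ct = F.relu(d_score + 0.1 * d_hidden - m_prime)
    return lam2 * ct.mean()
```

In the paper's overall objective this term is added to the gradient-penalized critic loss, so the critic is simultaneously pushed toward unit gradient norm between real and fake samples and toward consistent responses in the neighborhood of the real samples themselves.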

Empirical Results

The enhancements proposed in this paper are grounded in empirical evaluations on benchmark datasets such as MNIST and CIFAR-10. The authors demonstrate that their CT-enhanced method yields superior inception scores (e.g., an increase from 2.98 to 5.13 on CIFAR-10 with a small CNN) and reduces overfitting of the discriminator, as evidenced by more consistent behavior on held-out test data. Furthermore, the CT-GAN delivers competitive results on semi-supervised learning tasks, surpassing existing GAN-based methods and reaching a test error rate of 9.98% on CIFAR-10 with only 4,000 labeled samples.

Implications and Future Work

The introduction of a consistency regularization framework reflects a significant step forward in stabilizing WGAN training. By better balancing the discriminator's learning across the data manifold, the approach not only enhances sample quality but also broadens the applicability of WGANs to semi-supervised tasks.

Looking forward, this work suggests avenues for further research in optimizing consistency enforcement mechanisms and exploring their integration with larger architectures and more complex datasets. The idea of leveraging dropout-induced perturbations could also be extended to other models that need smooth, consistent behavior around the data manifold, contributing to broader advances in robust training methodologies. As the field progresses, further exploration of dynamic and adaptive regularization strategies, possibly informed by real-time model performance metrics, could continue to push the boundaries of generative adversarial learning.