Training GANs with Stronger Augmentations via Contrastive Discriminator (2103.09742v1)

Published 17 Mar 2021 in cs.LG, cs.AI, and cs.CV

Abstract: Recent works in Generative Adversarial Networks (GANs) are actively revisiting various data augmentation techniques as an effective way to prevent discriminator overfitting. It is still unclear, however, which augmentations could actually improve GANs, and in particular, how to apply a wider range of augmentations in training. In this paper, we propose a novel way to address these questions by incorporating a recent contrastive representation learning scheme into the GAN discriminator, coined ContraD. This "fusion" enables the discriminators to work with much stronger augmentations without increasing their training instability, thereby preventing the discriminator overfitting issue in GANs more effectively. Even better, we observe that the contrastive learning itself also benefits from our GAN training, i.e., by maintaining discriminative features between real and fake samples, suggesting a strong coherence between the two worlds: good contrastive representations are also good for GAN discriminators, and vice versa. Our experimental results show that GANs with ContraD consistently improve FID and IS compared to other recent techniques incorporating data augmentations, still maintaining highly discriminative features in the discriminator in terms of the linear evaluation. Finally, as a byproduct, we also show that our GANs trained in an unsupervised manner (without labels) can induce many conditional generative models via a simple latent sampling, leveraging the learned features of ContraD. Code is available at https://github.com/jh-jeong/ContraD.

Authors (2)
  1. Jongheon Jeong (26 papers)
  2. Jinwoo Shin (196 papers)
Citations (62)

Summary

An Analysis of "Training GANs with Stronger Augmentations via Contrastive Discriminator"

The paper "Training GANs with Stronger Augmentations via Contrastive Discriminator," authored by Jongheon Jeong and Jinwoo Shin, presents an innovative approach to stabilizing Generative Adversarial Networks (GANs) through the integration of contrastive representation learning into the discriminator. This technique, coined as the Contrastive Discriminator (ContraD), leverages the principles of contrastive learning to allow GAN discriminators to handle a broader range of strong data augmentations without succumbing to overfitting, traditional challenges in GAN training.

Key Contributions

The authors propose several key contributions through their work:

  1. Integration of Contrastive Learning: By embedding a contrastive learning objective within the GAN discriminator, ContraD can exploit the strong data augmentations typically used in contrastive learning, such as the SimCLR augmentation pipeline. The shared representation is trained to stay discriminative while still separating real from generated samples (a minimal sketch of this combined objective follows the list below).
  2. Robustness to Augmentation: The paper challenges prior assumptions that only limited data augmentations (like flipping and translating) are beneficial for GAN training by demonstrating that ContraD can robustly handle much stronger augmentations.
  3. Mutual Benefits: The research identifies mutual benefits between GAN training and contrastive feature learning. The discriminative features maintained by contrastive learning improve the discriminator's ability to separate real from synthetic images, and, conversely, the adversarial training dynamics further improve the contrastive representations.
  4. Performance Gains: The authors report consistent improvements in Fréchet Inception Distance (FID) and Inception Score (IS) across various datasets and architectures when using ContraD over other augmentation-based stabilization techniques like bCR and DiffAug.
  5. Conditional Model Induction via Self-Supervised Training: As a byproduct, the authors illustrate that the contrastive representations learned through ContraD can be used to derive conditional generative models from unconditionally (label-free) trained GANs via a simple latent sampling scheme that leverages the learned features.
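
To make the combined objective in point 1 concrete, below is a minimal sketch of a ContraD-style discriminator update, assuming a PyTorch setting. The module names (encoder, proj_real, proj_fake, disc_head), the simplified fake-sample contrastive term, and the choice of a hinge GAN loss are illustrative assumptions rather than the authors' exact implementation; the linked repository contains the precise formulation.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.1):
    """SimCLR NT-Xent loss between two batches of projected features."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                      # (2N, d)
    sim = z @ z.t() / temperature                       # pairwise cosine similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float('-inf'))          # drop self-similarities
    # The positive for each view is the other augmented view of the same image.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)]).to(z.device)
    return F.cross_entropy(sim, targets)

def contrad_d_step(encoder, proj_real, proj_fake, disc_head, strong_aug,
                   x_real, x_fake):
    """One discriminator update; x_fake is assumed detached from the generator."""
    # 1) SimCLR term on two strongly augmented views of the real batch
    #    (e.g. random resized crop, flip, color jitter, grayscale):
    #    the backbone is trained only through contrastive objectives.
    h1, h2 = encoder(strong_aug(x_real)), encoder(strong_aug(x_real))
    loss_con_real = nt_xent(proj_real(h1), proj_real(h2))

    # 2) Contrastive term on generated samples, simplified here to pulling two
    #    augmented views of each fake together (the paper uses a supervised
    #    contrastive variant that also contrasts fakes against the real views).
    f1, f2 = encoder(strong_aug(x_fake)), encoder(strong_aug(x_fake))
    loss_con_fake = nt_xent(proj_fake(f1), proj_fake(f2))

    # 3) GAN loss through a small head on *detached* features, so the
    #    adversarial gradient does not flow back into the contrastive backbone.
    d_real = disc_head(torch.cat([h1, h2], dim=0).detach())
    d_fake = disc_head(torch.cat([f1, f2], dim=0).detach())
    loss_gan = F.relu(1.0 - d_real).mean() + F.relu(1.0 + d_fake).mean()  # hinge

    return loss_con_real + loss_con_fake + loss_gan
```

The generator would then be trained against the scalar output of disc_head on its samples, as in a standard GAN; keeping the adversarial signal confined to this lightweight head is what allows the shared backbone to absorb much stronger augmentations.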

Experimental Validation

The authors conduct thorough experimental validation on diverse datasets, including CIFAR-10/100, CelebA-HQ, AFHQ, and ImageNet, using well-known GAN architectures: SNDCGAN, StyleGAN2, and BigGAN. The results highlight substantial improvements:

  • CIFAR-10 and CIFAR-100: ContraD significantly outperforms both contemporary augmentation methods and state-of-the-art GAN architectures like StyleGAN2 in terms of FID.
  • Higher Resolution Datasets: On AFHQ and CelebA-HQ datasets, ContraD consistently reduces FID scores, indicating enhanced image quality and model stability.
  • Linear Evaluation: The quality of the learned representations is assessed with the standard linear-evaluation protocol, where ContraD features yield notably higher linear classification accuracy than a SimCLR baseline trained on the same data (a minimal sketch of the protocol follows below).
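
For reference, the linear evaluation mentioned above follows the standard protocol of freezing the trained backbone and fitting only a linear classifier on its features. The sketch below assumes PyTorch; the function signature, data loader, and optimizer settings are illustrative choices, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

def linear_eval(encoder, feat_dim, num_classes, train_loader, epochs=10, lr=1e-3):
    """Fit a linear probe on top of a frozen (e.g. ContraD-trained) backbone."""
    encoder.eval()                                     # backbone stays frozen
    clf = nn.Linear(feat_dim, num_classes)
    opt = torch.optim.Adam(clf.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in train_loader:
            with torch.no_grad():                      # no gradients into the backbone
                feats = encoder(x)
            loss = criterion(clf(feats), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return clf  # report top-1 accuracy of clf on a held-out test set
```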

Implications and Future Work

The integration of contrastive learning into GANs has significant implications for improving the stability and performance of generative models across varying augmentation strengths. The coherence observed between the contrastive learning objective and the GAN discriminator objective suggests a promising direction for future research, where further exploration could benefit both generative modeling and representation learning. Additionally, extending the idea to larger-scale, high-resolution datasets could offer new benchmarks for GAN performance.

In conclusion, the paper presents a compelling case for incorporating contrastive principles into GAN training, pushing the boundaries of data augmentation use and improving both model stability and sample quality. As data requirements and model complexities continue to grow, the methodology and insights from this research could serve as a valuable framework for developing more resilient and representative generative models in the field of AI.