Differentiable Augmentation for Data-Efficient GAN Training (2006.10738v4)

Published 18 Jun 2020 in cs.CV, cs.GR, and cs.LG

Abstract: The performance of generative adversarial networks (GANs) heavily deteriorates given a limited amount of training data. This is mainly because the discriminator is memorizing the exact training set. To combat it, we propose Differentiable Augmentation (DiffAugment), a simple method that improves the data efficiency of GANs by imposing various types of differentiable augmentations on both real and fake samples. Previous attempts to directly augment the training data manipulate the distribution of real images, yielding little benefit; DiffAugment enables us to adopt the differentiable augmentation for the generated samples, effectively stabilizes training, and leads to better convergence. Experiments demonstrate consistent gains of our method over a variety of GAN architectures and loss functions for both unconditional and class-conditional generation. With DiffAugment, we achieve a state-of-the-art FID of 6.80 with an IS of 100.8 on ImageNet 128x128 and 2-4x reductions of FID given 1,000 images on FFHQ and LSUN. Furthermore, with only 20% training data, we can match the top performance on CIFAR-10 and CIFAR-100. Finally, our method can generate high-fidelity images using only 100 images without pre-training, while being on par with existing transfer learning algorithms. Code is available at https://github.com/mit-han-lab/data-efficient-gans.

An Expert Review of "Differentiable Augmentation for Data-Efficient GAN Training"

The paper "Differentiable Augmentation for Data-Efficient GAN Training" presents a novel approach, Differentiable Augmentation (DiffAugment), aimed at enhancing the data efficiency of Generative Adversarial Networks (GANs). This research addresses a significant challenge in GANs: the performance degradation caused by limited training data. The authors identify that the discriminator often memorizes training data, leading to poor generalization and instability.

Key Contributions

The primary contribution of this paper is the introduction of DiffAugment, a technique that applies the same differentiable augmentations to both real and generated samples during GAN training. This contrasts with previous attempts that augmented only the real images, which shifts the distribution the generator learns to match and yields little benefit. By transforming both sides, DiffAugment regularizes the discriminator against memorization without altering the target distribution, preserving the balance of GAN training dynamics.
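Concretely, the paper folds a differentiable transform T into both players' objectives. In its notation, with f_D and f_G denoting the per-sample losses of whichever GAN loss is used (e.g., hinge or non-saturating):

$$\mathcal{L}_D = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[f_D\big(-D(T(x))\big)\big] + \mathbb{E}_{z \sim p(z)}\big[f_D\big(D(T(G(z)))\big)\big]$$

$$\mathcal{L}_G = \mathbb{E}_{z \sim p(z)}\big[f_G\big(-D(T(G(z)))\big)\big]$$

Because T appears inside the generator's objective and is differentiable, gradients flow through T back to G; this is what separates DiffAugment from augmenting the real data alone.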

Methodology

The authors employ three families of differentiable augmentations (translation, cutout, and color transforms), applied identically to both real and fake images. They formulate the discriminator and generator updates with these augmentations in place, keeping every transform differentiable so that gradients can propagate through the augmentations back to the generator.
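The authors' reference implementation is in the linked repository; what follows is only a minimal PyTorch sketch of the idea, not their exact code. The helper names (rand_brightness, rand_translation, rand_cutout, diff_augment) are illustrative, random offsets are drawn once per batch rather than per sample for brevity, and translation uses a circular shift where the paper pads and crops:

```python
import torch

def rand_brightness(x):
    # Additive brightness jitter in [-0.5, 0.5); differentiable in x.
    return x + (torch.rand(x.size(0), 1, 1, 1, device=x.device) - 0.5)

def rand_translation(x, ratio=0.125):
    # Random circular shift; torch.roll only permutes pixels, so the
    # operation stays differentiable with respect to x.
    max_shift = max(1, int(x.size(2) * ratio))
    dx = int(torch.randint(-max_shift, max_shift + 1, (1,)))
    dy = int(torch.randint(-max_shift, max_shift + 1, (1,)))
    return torch.roll(x, shifts=(dx, dy), dims=(2, 3))

def rand_cutout(x, ratio=0.5):
    # Zero a random square patch via a multiplicative mask; the mask does
    # not depend on x, so gradients still flow through unmasked pixels.
    size = max(1, int(x.size(2) * ratio))
    cx = int(torch.randint(0, x.size(2) - size + 1, (1,)))
    cy = int(torch.randint(0, x.size(3) - size + 1, (1,)))
    mask = torch.ones_like(x)
    mask[:, :, cx:cx + size, cy:cy + size] = 0
    return x * mask

def diff_augment(x):
    # The same stochastic family of transforms T(.) is applied to real
    # and generated batches alike.
    for f in (rand_brightness, rand_translation, rand_cutout):
        x = f(x)
    return x
```

In use, every discriminator input passes through it: D(diff_augment(real)) and D(diff_augment(fake)) in the discriminator step, and D(diff_augment(G(z))) in the generator step so that gradients reach G.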

Numerical Results

The paper reports substantial improvements in GAN performance across various architectures and datasets. Notably:

  • On ImageNet (128x128 resolution), DiffAugment achieves an FID of 6.80 and IS of 100.8, surpassing previous methods without using the truncation trick.
  • When trained on only 1,000 images from FFHQ and LSUN, DiffAugment reduces FID by a factor of 2-4 compared to the baselines.
  • With just 20% of CIFAR-10 and CIFAR-100 data, performance is comparable to top benchmarks, demonstrating the method's data efficiency.

Implications

Practically, this technique enables the generation of high-fidelity images from substantially less data, which matters for domains where data collection is difficult or costly. Theoretically, DiffAugment offers insight into GAN regularization strategies, emphasizing the need to keep generator and discriminator training balanced.

Future Directions

The proposed method opens avenues for further exploration of differentiable augmentations and their integration into other generative models. Future research could investigate adaptive augmentation strategies tailored to specific datasets or tasks.

In conclusion, this paper presents a noteworthy advancement in data-efficient GAN training, offering a robust solution to overfitting while maintaining model performance. The results suggest a promising direction for developing generative models that are less reliant on extensive datasets, potentially broadening the applicability of GANs in various domains.

Authors (5)
  1. Shengyu Zhao
  2. Zhijian Liu
  3. Ji Lin
  4. Jun-Yan Zhu
  5. Song Han
Citations (568)