- The paper introduces a novel birthday paradox test to quantify the support size of GAN-generated distributions, directly highlighting mode collapse issues.
- The empirical analysis reveals that many established GAN variants generate outputs with significantly reduced diversity on datasets like CelebA and CIFAR-10.
- The study finds that increasing discriminator capacity enhances sample diversity, suggesting pathways to mitigate mode collapse in GAN architectures.
Empirical Analysis of Distribution Learning in GANs
The paper presents an empirical investigation into whether Generative Adversarial Networks (GANs) actually learn the target distribution, addressing potential shortcomings of current GAN implementations. Earlier theoretical results indicate that GANs should learn the target distribution under ideal conditions, such as very large networks, sufficient sample sizes, and adequate computational resources; the authors examine whether this conclusion still holds when those conditions are not met in practice.
The core problem explored in this paper is whether GANs suffer from mode collapse, a situation where the generated distribution lacks diversity and fails to accurately represent the training distribution. This is a critical issue as it implies that while GANs might excel at generating realistic-looking images, they might not capture the full diversity present in the training data.
Key Contributions
The authors introduce a novel test, rooted in the "birthday paradox", to empirically estimate the support size of distributions generated by GANs. The test evaluates whether the diversity of generated samples reflects that of the real, underlying distribution:
- Birthday Paradox Test: Draw a batch of s samples from the generator and check it for (near-)duplicate images. By the birthday paradox, if a batch of size s contains a duplicate with probability around one half, the support size of the generated distribution is roughly s². Frequent duplicates in small batches are therefore direct evidence of limited diversity, i.e., mode collapse (a code sketch follows this list).
- Empirical Findings: Applying this test, the authors find that many well-established GAN variants produce distributions whose support size is significantly smaller than that of the presumed target. For instance, experiments on CelebA and CIFAR-10 show that the estimated support falls well short of the diversity present in the training data, such as the variety of human faces, casting doubt on the generalization capabilities of these models.
- Discriminator Capacity: The diversity of the generated distribution is found to correlate with the capacity of the discriminator: increasing the discriminator's size tends to increase the support size of the learned distribution, suggesting better-designed discriminators as one path to mitigating mode collapse.
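To make the procedure concrete, below is a minimal sketch in Python/NumPy of the two ingredients of such a test: flagging the most similar pairs in a generated batch for visual inspection, and applying the s² rule of thumb once duplicates are confirmed. The generator interface (`G.sample`) in the usage comment is a hypothetical placeholder, and pixel-space Euclidean distance is only one possible similarity heuristic; the final judgement of whether two flagged images are true near-duplicates is made by eye.

```python
import numpy as np

def closest_pairs(samples, k=20):
    """Return the k most similar pairs in a batch by Euclidean distance.

    `samples` is an (s, d) array of flattened generated images. The pairs
    returned here are only candidates; near-duplicates are confirmed visually.
    """
    samples = np.asarray(samples, dtype=np.float64)
    s = samples.shape[0]
    # Pairwise squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq = np.sum(samples ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * samples @ samples.T
    # Mask the diagonal and the redundant upper triangle.
    d2[np.triu_indices(s)] = np.inf
    flat = np.argsort(d2, axis=None)[:k]
    return [np.unravel_index(i, d2.shape) for i in flat]

def support_size_estimate(batch_size, collision_observed):
    """Birthday-paradox heuristic: a duplicate in a batch of size s with
    probability about 1/2 implies a support size of roughly s**2.
    (The standard approximation P(collision) ~ 1 - exp(-s^2 / (2N)) gives
    N ~ s^2 / (2 ln 2); the coarser s^2 rule of thumb is the same order.)"""
    return batch_size ** 2 if collision_observed else None

# Hypothetical usage, assuming a generator G with a sample(n) method:
# batch = G.sample(400).reshape(400, -1)   # e.g. 400 candidate face images
# pairs = closest_pairs(batch)             # inspect these pairs by eye
# estimate = support_size_estimate(400, collision_observed=True)  # ~160,000
```

In practice, one varies the batch size s to find the smallest batch in which duplicates reliably appear, and that value of s yields the support-size estimate.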
Implications and Future Directions
The findings indicate that the commonly used GAN training objective does not by itself ensure sufficient distributional diversity, even when sample quality is visually satisfactory. This poses a tangible challenge and motivates further work on GAN objectives that inherently prevent mode collapse.
Given that models such as ALI and BiGANs show comparatively better sample diversity under this test, future research might explore hybrid models or novel architectures that retain adversarial training while incorporating mechanisms to promote and measure sample diversity throughout learning.
Additionally, these insights matter not only for theoretical validation but also for practical applications where faithfully modeling the data distribution is critical, such as image synthesis, anomaly detection, and other computer vision tasks. The research encourages redefining the success criterion for GANs from generating realistic samples to comprehensively representing the target data distribution. As the field evolves, ensuring that GANs reliably learn complete distributions remains a pivotal objective, requiring continued innovation in both GAN architectures and evaluation metrics.