DeLiGAN : Generative Adversarial Networks for Diverse and Limited Data (1706.02071v1)

Published 7 Jun 2017 in cs.CV

Abstract: A class of recent approaches for generating images, called Generative Adversarial Networks (GAN), have been used to generate impressively realistic images of objects, bedrooms, handwritten digits and a variety of other image modalities. However, typical GAN-based approaches require large amounts of training data to capture the diversity across the image modality. In this paper, we propose DeLiGAN -- a novel GAN-based architecture for diverse and limited training data scenarios. In our approach, we reparameterize the latent generative space as a mixture model and learn the mixture model's parameters along with those of GAN. This seemingly simple modification to the GAN framework is surprisingly effective and results in models which enable diversity in generated samples although trained with limited data. In our work, we show that DeLiGAN can generate images of handwritten digits, objects and hand-drawn sketches, all using limited amounts of data. To quantitatively characterize intra-class diversity of generated samples, we also introduce a modified version of "inception-score", a measure which has been found to correlate well with human assessment of generated samples.

Summary

The paper introduces a reparameterized latent space using a mixture of Gaussians to enhance diversity with limited training data.
DeLiGAN's architecture, validated on MNIST, CIFAR-10, and sketch datasets, outperforms traditional GANs in generating realistic samples.
The modified inception score extends the inception score, a measure reported to correlate well with human assessment, so that it also quantifies intra-class diversity of generated samples.

Overview of DeLiGAN: Generative Adversarial Networks for Diverse and Limited Data

The paper "DeLiGAN: Generative Adversarial Networks for Diverse and Limited Data" addresses a significant limitation of traditional Generative Adversarial Networks (GANs): their dependence on large datasets to generate diverse and realistic images. The authors propose a novel approach, DeLiGAN, which modifies the conventional GAN framework to effectively handle scenarios where the training data is both diverse and limited.

Key Contributions and Methodology

Reparameterization of Latent Space:
- DeLiGAN reparameterizes the latent space as a mixture of Gaussians and learns the mixture's parameters along with those of the GAN. This change helps the generator produce diverse outputs even when training data is limited.
- Instead of sampling from a simple latent distribution, DeLiGAN samples from a mixture, enabling it to capture complex data distributions without requiring vast amounts of data.
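The sampling step described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes components are chosen uniformly at random and that each component has a diagonal scale, and the function name, component count, latent dimension, and initial values are all hypothetical. In actual training the means and scales would be learnable parameters updated by backpropagation through the generator, which plain NumPy does not do.

```python
import numpy as np

def sample_mixture_latents(mu, sigma, batch_size, rng):
    """Draw latents z = mu_k + sigma_k * eps, with k chosen uniformly.

    mu, sigma: arrays of shape (n_components, latent_dim).
    Returns the latents and the component index used for each one.
    """
    n_components, latent_dim = mu.shape
    # Pick one mixture component per sample (uniform choice is an assumption).
    k = rng.integers(0, n_components, size=batch_size)
    # Standard normal noise, shifted and scaled by the chosen component.
    eps = rng.standard_normal((batch_size, latent_dim))
    z = mu[k] + sigma[k] * eps
    return z, k

rng = np.random.default_rng(0)
n_components, latent_dim = 5, 8            # hypothetical sizes
mu = rng.uniform(-1.0, 1.0, size=(n_components, latent_dim))
sigma = np.full((n_components, latent_dim), 0.2)   # hypothetical initial scale
z, k = sample_mixture_latents(mu, sigma, batch_size=1000, rng=rng)
print(z.shape)
```

Writing the sample as a shift and scale of standard normal noise keeps the draw differentiable with respect to the means and scales, which is what would allow them to be trained jointly with the generator in an autodiff framework.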
Model Architecture and Experimental Validation:
- The paper outlines the architectural choices for DeLiGAN, chiefly the addition of a mixture-of-Gaussians layer in the latent space.
- Experimental validation covers multiple data modalities: handwritten digits (MNIST), color photographs of objects (CIFAR-10), and hand-drawn sketches (TU-Berlin). In each case, DeLiGAN generates more diverse samples than baseline GAN models.
Modified Inception Score (m-IS):
- To quantify the diversity and quality of generated samples, the authors introduce a modified inception score (m-IS), which adapts the original inception score to better reflect intra-class diversity.
- The original inception score has been found to correlate well with human assessment; the m-IS builds on it so that variability within a class is measured alongside sample fidelity.
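One way to make an inception-style score sensitive to diversity is to compare the classifier's predicted label distributions across pairs of samples. The sketch below assumes a pairwise-KL form, the exponential of the mean KL divergence between label distributions of sample pairs; the summary above does not state the exact formula, so the normalization and the function name `pairwise_kl_score` are assumptions rather than the paper's definition. The input is assumed to be a matrix of class probabilities already produced by some pretrained classifier.

```python
import numpy as np

def pairwise_kl_score(probs, eps=1e-12):
    """exp(mean over pairs (i, j) of KL(p_i || p_j)).

    probs: array of shape (n_samples, n_classes), rows are class
    probabilities for each generated sample.
    """
    # Clip and renormalize so logarithms are finite.
    p = np.clip(np.asarray(probs, dtype=float), eps, 1.0)
    p = p / p.sum(axis=1, keepdims=True)
    log_p = np.log(p)
    # KL(p_i || p_j) = sum_c p_i[c] * (log p_i[c] - log p_j[c])
    neg_entropy = np.sum(p * log_p, axis=1)   # sum_c p_i[c] log p_i[c]
    cross = p @ log_p.T                       # cross[i, j] = sum_c p_i[c] log p_j[c]
    kl = neg_entropy[:, None] - cross         # shape (n_samples, n_samples)
    # The mean includes the i == j pairs, which contribute zero.
    return float(np.exp(kl.mean()))

identical = np.tile([0.7, 0.2, 0.1], (4, 1))
varied = np.array([[0.90, 0.05, 0.05],
                   [0.05, 0.90, 0.05],
                   [0.05, 0.05, 0.90]])
print(pairwise_kl_score(identical), pairwise_kl_score(varied))
```

Under this form, a set of samples with identical predictions scores the minimum value of 1, and the score grows as predictions differ between samples. To measure intra-class diversity, the samples would presumably be grouped by class first and the per-class scores averaged; that grouping step is also an assumption here.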

Numerical Results and Comparative Analysis

In experiments with toy datasets and real-world scenarios (MNIST, CIFAR-10, TU-Berlin Sketches), DeLiGAN consistently outperforms traditional GANs in generating more diverse and qualitatively superior samples.
The modified inception score analysis reveals that DeLiGAN maintains higher diversity in sample generation without sacrificing image realism, a notable improvement over existing methods.

Implications and Future Directions

DeLiGAN shows that a GAN can be improved by increasing the modeling power of its latent space rather than by changing the generator or discriminator alone. The approach is relevant to applications where data is scarce or highly varied, such as medical imaging or rare object generation.

Future work could explore more complex mixture models and alternative latent space parameterizations to further improve diversity and stability. Integrating DeLiGAN's principles into other generative frameworks, and testing how the approach scales to larger datasets, would also clarify its broader applicability.

In conclusion, this paper provides a substantive contribution to the field of generative modeling, offering a viable solution to a common data scarcity problem in machine learning and opening avenues for further explorations in generative architectures.
