Feature Quantization Improves GAN Training (2004.02088v2)

Published 5 Apr 2020 in cs.LG, cs.CV, and stat.ML

Abstract: The instability in GAN training has been a long-standing problem despite remarkable research efforts. We identify that instability issues stem from difficulties of performing feature matching with mini-batch statistics, due to a fragile balance between the fixed target distribution and the progressively generated distribution. In this work, we propose Feature Quantization (FQ) for the discriminator, to embed both true and fake data samples into a shared discrete space. The quantized values of FQ are constructed as an evolving dictionary, which is consistent with feature statistics of the recent distribution history. Hence, FQ implicitly enables robust feature matching in a compact space. Our method can be easily plugged into existing GAN models, with little computational overhead in training. We apply FQ to 3 representative GAN models on 9 benchmarks: BigGAN for image generation, StyleGAN for face synthesis, and U-GAT-IT for unsupervised image-to-image translation. Extensive experimental results show that the proposed FQ-GAN can improve the FID scores of baseline methods by a large margin on a variety of tasks, achieving new state-of-the-art performance.

Citations (46)

Summary

  • The paper introduces feature quantization as a novel approach to stabilize GAN training by embedding data samples into a shared discrete space.
  • The paper demonstrates that integrating FQ improves convergence, enhances FID scores, and reduces mode collapse in various GAN frameworks.
  • The paper suggests that FQ’s simplicity and efficacy open new avenues for research in high-resolution image synthesis and other adversarial models.

Analyzing Feature Quantization to Enhance GAN Training

The paper "Feature Quantization Improves GAN Training" presents a novel approach to tackling the instability issues associated with training Generative Adversarial Networks (GANs). The authors propose Feature Quantization (FQ) as a method to improve the training of GANs by embedding both real and fake data samples into a shared discrete space within the discriminator. This technique is integrated into existing GAN architectures with minimal computational overhead, evidenced by significant improvements in FID scores across multiple benchmarks.

Key Contributions

The paper's primary contribution is the introduction of FQ, which quantizes feature representations in the discriminator to facilitate more stable and effective GAN training. This is achieved by quantizing continuous discriminator features into entries of a discrete dictionary that evolves dynamically as training progresses. The quantization allows features of real and generated samples to be matched more accurately, addressing the long-standing instability of GAN training.
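
As a rough, self-contained sketch of how such a layer could sit inside a discriminator, the PyTorch snippet below quantizes an intermediate feature map to its nearest dictionary entries and uses a straight-through estimator so gradients still reach the continuous features. All names (ToyDiscriminator, quantize, codebook) and sizes are illustrative assumptions, not the authors' released implementation, and the dictionary is held fixed here for brevity; keeping it consistent with recent feature statistics is sketched in the next section.

import torch
import torch.nn as nn

class ToyDiscriminator(nn.Module):
    # Toy discriminator with a feature-quantization step between two conv blocks.
    # Names and sizes are illustrative assumptions, not the paper's implementation.
    def __init__(self, codebook_size=64, feat_dim=128):
        super().__init__()
        self.block1 = nn.Sequential(
            nn.Conv2d(3, feat_dim, 4, stride=2, padding=1), nn.LeakyReLU(0.2))
        self.block2 = nn.Sequential(
            nn.Conv2d(feat_dim, feat_dim, 4, stride=2, padding=1), nn.LeakyReLU(0.2))
        self.head = nn.Linear(feat_dim, 1)
        # Dictionary of discrete feature values shared by real and fake samples;
        # fixed here for brevity, whereas FQ-GAN keeps it evolving during training.
        self.register_buffer("codebook", torch.randn(codebook_size, feat_dim))

    def quantize(self, h):
        # h: (B, C, H, W) -> treat each spatial position as a C-dimensional feature.
        b, c, hh, ww = h.shape
        flat = h.permute(0, 2, 3, 1).reshape(-1, c)              # (B*H*W, C)
        codes = torch.cdist(flat, self.codebook).argmin(dim=1)   # nearest codeword per position
        q = self.codebook[codes].reshape(b, hh, ww, c).permute(0, 3, 1, 2)
        # Straight-through estimator: use quantized values in the forward pass,
        # but let gradients flow back to the continuous features.
        return h + (q - h).detach()

    def forward(self, x):
        h = self.block1(x)
        h = self.quantize(h)   # real and fake features land in the same discrete space
        h = self.block2(h)
        return self.head(h.mean(dim=(2, 3)))   # pooled features -> real/fake logit

disc = ToyDiscriminator()
logits = disc(torch.randn(4, 3, 32, 32))       # shape (4, 1)

Because the same quantization is applied to features of real and generated images alike, the discriminator effectively compares the two within the shared discrete space.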

Theoretical and Experimental Insights

The authors anchor their approach in the observation that traditional methods relying on mini-batch statistics for feature matching are prone to inaccuracies. These inaccuracies stem from the non-stationary nature of GAN training, where the distribution of generated samples evolves over time. FQ mitigates these issues by conducting feature matching within a quantized space that reflects the characteristics of the recent distribution history. The theoretical underpinning of the method rests on the stability gained by discretizing the feature space, leading to a more robust training environment.
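
The "evolving dictionary" is meant to stay consistent with feature statistics from the recent distribution history. One common way to realize that behavior is an exponential moving-average update over the features assigned to each codeword, as in EMA variants of vector quantization; the sketch below applies this idea to features pooled from real and fake mini-batches. The function and variable names (ema_codebook_update, ema_sums, ema_counts) and the decay value are illustrative assumptions, not the paper's exact update rule.

import torch

def ema_codebook_update(ema_sums, ema_counts, features, codes, decay=0.99, eps=1e-5):
    # One exponential-moving-average update of the dictionary (illustrative sketch).
    # ema_sums:   (K, C) running sum of features assigned to each codeword
    # ema_counts: (K,)   running number of assignments per codeword
    # features:   (N, C) discriminator features from the current real+fake mini-batch
    # codes:      (N,)   index of the nearest codeword for each feature
    K = ema_sums.size(0)
    one_hot = torch.zeros(features.size(0), K, device=features.device)
    one_hot.scatter_(1, codes.unsqueeze(1), 1.0)        # (N, K) hard assignments

    batch_counts = one_hot.sum(dim=0)                   # (K,) hits per codeword this batch
    batch_sums = one_hot.t() @ features                 # (K, C) feature mass per codeword

    # Blend the current batch into the running history, so the dictionary tracks
    # the recent distribution of both real and generated features.
    ema_counts = decay * ema_counts + (1 - decay) * batch_counts
    ema_sums = decay * ema_sums + (1 - decay) * batch_sums

    codebook = ema_sums / (ema_counts.unsqueeze(1) + eps)   # mean feature per codeword
    return ema_sums, ema_counts, codebook

# Usage with random stand-ins for discriminator features.
K, C, N = 64, 128, 256
ema_sums, ema_counts = torch.zeros(K, C), torch.zeros(K)
feats = torch.randn(N, C)            # concatenated real + fake features
codes = torch.randint(0, K, (N,))    # nearest-codeword indices from the quantizer
ema_sums, ema_counts, codebook = ema_codebook_update(ema_sums, ema_counts, feats, codes)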

Empirically, the authors demonstrate the efficacy of FQ in several GAN frameworks, including BigGAN and StyleGAN, across datasets such as CIFAR-10, CIFAR-100, and ImageNet. The improvements are reflected in better FID scores, a widely used measure of GAN performance that captures both the fidelity and the diversity of generated samples. Additionally, the FQ-GAN models show enhanced convergence rates and reduced susceptibility to mode collapse.
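
For reference, FID models the Inception features of real and generated images as Gaussians with means $\mu_r, \mu_g$ and covariances $\Sigma_r, \Sigma_g$, and reports the Fréchet distance between the two distributions (lower is better):

$\mathrm{FID} = \|\mu_r - \mu_g\|_2^2 + \operatorname{Tr}\left(\Sigma_r + \Sigma_g - 2(\Sigma_r \Sigma_g)^{1/2}\right)$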

Implications and Future Directions

The introduction of FQ opens several avenues for future research in generative modeling. First, it makes a compelling case for revisiting how features are represented within the discriminator of a GAN, emphasizing the potential of discrete structures. Practically, the method can be explored in conjunction with other GAN variants aimed at high-resolution image synthesis or domain adaptation tasks.

Moreover, the methodological simplicity of FQ suggests its potential application in broader contexts beyond GANs, such as reinforcement learning or other adversarial frameworks that suffer from similar instabilities. Future work could further optimize the dictionary update mechanism or explore adaptive methods to determine the quantization levels dynamically.

Conclusion

In summary, the paper effectively demonstrates how feature quantization can play a pivotal role in stabilizing and improving GAN training. By leveraging a dynamic and consistent feature-matching strategy, FQ-GAN advances the state of the art in generative modeling. The paper lays solid groundwork for future work on discretizing feature spaces to improve training stability and performance in high-dimensional settings.
