
Freeze the Discriminator: a Simple Baseline for Fine-Tuning GANs

Published 25 Feb 2020 in cs.CV, cs.LG, and stat.ML | (2002.10964v2)

Abstract: Generative adversarial networks (GANs) have shown outstanding performance on a wide range of problems in computer vision, graphics, and machine learning, but often require large amounts of training data and heavy computational resources. To tackle this issue, several methods introduce transfer learning techniques into GAN training. They, however, are either prone to overfitting or limited to learning small distribution shifts. In this paper, we show that simple fine-tuning of GANs with frozen lower layers of the discriminator performs surprisingly well. This simple baseline, FreezeD, significantly outperforms previous techniques used in both unconditional and conditional GANs. We demonstrate the consistent effect using StyleGAN and SNGAN-projection architectures on several datasets: Animal Face, Anime Face, Oxford Flower, CUB-200-2011, and Caltech-256. The code and results are available at https://github.com/sangwoomo/FreezeD.

Citations (204)

Summary

  • The paper introduces FreezeD, a method that freezes the lower layers of the discriminator to enhance transfer learning in GANs.
  • It leverages the generic feature extraction in early layers to mitigate overfitting and reduce computational resource demands.
  • Experiments show FreezeD’s superior stability and improved FID scores across various GAN architectures and challenging datasets.

Freeze the Discriminator: A Simple Baseline for Fine-Tuning GANs

The paper "Freeze the Discriminator: a Simple Baseline for Fine-Tuning GANs" presents a straightforward yet effective approach to improving the transfer learning capabilities of generative adversarial networks (GANs). The work addresses the challenge of resource-intensive GAN training, especially when training data is limited, and proposes a fine-tuning method that outperforms existing transfer techniques.

Generative adversarial networks (GANs), since their inception by Goodfellow et al., have been pivotal in numerous domains such as computer vision and graphics. However, their practical deployment often faces hurdles due to the excessive demand for large datasets and computational resources. The paper addresses these concerns by leveraging transfer learning, a strategy that has significantly advanced deep learning methodologies.

Contributions and Methodology

The primary contribution of the paper is a simple yet effective baseline termed FreezeD. It operates by freezing the lower layers of the discriminator during fine-tuning. This setup capitalizes on the observation that the lower layers capture generic image features while the upper layers specialize in the real-vs-fake classification task. Although this idea is well established in classifier training, it had not been widely applied to GANs. Through extensive experiments, FreezeD is shown to substantially improve stability and performance over existing methods.
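The core mechanism can be sketched in a few lines of PyTorch. This is an illustrative approximation, not the authors' released code: the toy discriminator, the layer split, and the function name `freeze_lower_layers` are all assumptions for exposition.

```python
import torch
import torch.nn as nn

def freeze_lower_layers(discriminator: nn.Sequential, n_frozen: int) -> None:
    """Freeze the first `n_frozen` top-level modules of the discriminator,
    so fine-tuning updates only the upper, task-specific layers."""
    for i, module in enumerate(discriminator):
        trainable = i >= n_frozen
        for p in module.parameters():
            p.requires_grad = trainable

# Toy stand-in for a pre-trained discriminator (32x32 RGB input).
D = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1),   # lower: generic features
    nn.Conv2d(16, 32, 3, stride=2, padding=1),  # lower: generic features
    nn.Conv2d(32, 64, 3, stride=2, padding=1),  # upper: fine-tuned
    nn.Flatten(),
    nn.Linear(64 * 4 * 4, 1),                   # upper: real/fake head
)

freeze_lower_layers(D, n_frozen=2)

# During fine-tuning, hand the optimizer only the still-trainable parameters.
trainable_params = [p for p in D.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable_params, lr=2e-4, betas=(0.5, 0.999))
```

The generator is fine-tuned as usual; only the discriminator's lower layers are held fixed at their source-domain weights.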

Key existing techniques are acknowledged including:

  • Fine-tuning: Though effective, it often risks overfitting, particularly in cases with limited data.
  • Scale/Shift: While it attempts to mitigate overfitting by only updating normalization layers, it can be restrictive, particularly in cases with large distribution shifts.
  • Generative Latent Optimization (GLO): This introduces a supervised loss component but risks producing blurry images due to the absence of adversarial loss.
  • MineGAN: It focuses on adapting the latent representations but struggles when source and target distributions have minimal overlap.

FreezeD promises simplicity and efficiency. It suggests a potential pathway for designing robust and transfer-effective GANs without complex architectures or training regimes.

Experimental Evaluation

The paper's experiments encompass a range of GAN architectures including StyleGAN and SNGAN-projection, applied to datasets like Animal Face, Anime Face, Oxford Flower, and others. The results underline the stability and improved (lower) FID scores attained with FreezeD. Notably, the largest gains appear in challenging conditions with small datasets and substantial distribution shifts.

  • Unconditional GANs: On the Animal and Anime Face datasets, FreezeD consistently outperforms standard fine-tuning, enhancing both performance and stability.
  • Conditional GANs: Applied to Oxford Flower, CUB-200-2011, and Caltech-256, FreezeD demonstrates its robust adaptability across diverse domain-specific challenges.

One interesting outcome is the ablation study on the number of frozen layers, highlighting the nuanced balance between flexibility and stability in fine-tuning GANs.
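An ablation of this kind amounts to sweeping the freeze depth and comparing results per setting. The sketch below is hypothetical (the helper `freeze_first_k` and the model are placeholders, and parameter counts stand in for a real FID evaluation), but it shows the knob the ablation turns: each additional frozen layer removes capacity from fine-tuning in exchange for stability.

```python
import torch.nn as nn

def freeze_first_k(model: nn.Sequential, k: int) -> int:
    """Freeze the first k top-level modules; return the trainable parameter
    count, i.e. how much capacity is left for fine-tuning."""
    for i, module in enumerate(model):
        for p in module.parameters():
            p.requires_grad = i >= k
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Toy discriminator; in the real study each setting would be fine-tuned
# and scored with FID rather than just counted.
D = nn.Sequential(
    nn.Linear(64, 64), nn.Linear(64, 64), nn.Linear(64, 64), nn.Linear(64, 1)
)

counts = [freeze_first_k(D, k) for k in range(len(D) + 1)]
for k, n in enumerate(counts):
    print(f"freeze first {k} layers -> {n:5d} trainable parameters")
```

Too few frozen layers re-introduces overfitting on small targets; too many leaves the discriminator unable to adapt, so the best depth sits in between and is dataset-dependent.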

Implications and Future Directions

FreezeD establishes a compelling baseline for future research on optimizing GAN training under resource constraints. Its success argues for further study of strategies that partition and exploit the distinct roles of layers within GAN architectures. It also opens avenues for reusing discriminator features more broadly, for instance in digital forensics, where detecting GAN-generated images is crucial.

Future developments might explore integrating advanced feature distillation methods or dynamic layer freezing strategies as potential extensions to the FreezeD approach. These could offer new benchmarks for both generating higher quality images and ensuring better convergence properties across varied tasks.

In summary, this paper successfully presents a methodologically simple yet performance-enhancing approach to GAN transfer learning, encouraging a reevaluation of traditional fine-tuning processes within adversarial networks.


Authors (3)
