- The paper introduces the Two Time-Scale Update Rule (TTUR), which assigns separate learning rates to the discriminator and the generator and provably converges to a local Nash equilibrium.
- Convergence is proved via two time-scale stochastic approximation theory, and the analysis extends to the Adam optimizer, whose dynamics are modeled as a heavy ball with friction.
- Extensive experiments show that TTUR yields lower Fréchet Inception Distances, and thus better image quality, on CelebA, CIFAR-10, SVHN, and LSUN Bedrooms.
Two Time-Scale GANs: An Overview
The paper "GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium," authored by Martin Heusel and colleagues, presents a nuanced approach to the training of Generative Adversarial Networks (GANs) by employing a Two Time-Scale Update Rule (TTUR). This method differentiates itself by assigning individual learning rates to the discriminator and the generator, tackling the long-standing challenge of convergence in GAN training.
Key Contributions
- Two Time-Scale Update Rule (TTUR): The central contribution is TTUR, which uses distinct learning rates for the GAN's discriminator and generator. This separation of time scales provably yields convergence to a local Nash equilibrium, a point where neither network can unilaterally improve its objective within a local neighborhood of its parameters (a minimal training-loop sketch appears after this list).
- Convergence Proof: Using the theory of two time-scale stochastic approximation, the paper proves that TTUR converges under mild assumptions (the flavor of the step-size conditions is sketched after this list). The analysis also covers the widely used Adam optimizer, whose averaged dynamics the authors describe as those of a heavy ball with friction.
- Fréchet Inception Distance (FID): The paper introduces FID as an evaluation metric that assesses image quality more consistently than the Inception Score. FID fits Gaussians to the Inception-feature statistics of real and generated images and measures the Fréchet distance between them (see the formula after this list), giving a robust indicator of how close the generated distribution is to the real one.
- Experimental Validation: The authors conduct extensive experiments with TTUR on CelebA, CIFAR-10, SVHN, and LSUN Bedrooms. Models trained with TTUR consistently outperform those trained with a conventional single time-scale update rule.
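To make the mechanics concrete, here is a minimal TTUR training loop in PyTorch. It is a sketch under assumptions: the toy architectures, batch size, and the learning rates 1e-4 (generator) and 4e-4 (discriminator) are illustrative placeholders, not the paper's tuned settings; the only TTUR-specific ingredient is that the two Adam optimizers receive different rates.

```python
import torch
import torch.nn as nn

# Illustrative TTUR setup: the discriminator gets the larger learning rate,
# so it runs on the faster time scale. Architectures and rates below are
# placeholders, not the paper's exact settings.
G = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 784))
D = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 1))

opt_G = torch.optim.Adam(G.parameters(), lr=1e-4)  # slow time scale
opt_D = torch.optim.Adam(D.parameters(), lr=4e-4)  # fast time scale
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(32, 784)   # stand-in for a batch of real data
    z = torch.randn(32, 64)       # latent noise
    fake = G(z)

    # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    loss_D = bce(D(real), torch.ones(32, 1)) + \
             bce(D(fake.detach()), torch.zeros(32, 1))
    opt_D.zero_grad()
    loss_D.backward()
    opt_D.step()

    # Generator step: push D(fake) toward 1.
    loss_G = bce(D(fake), torch.ones(32, 1))
    opt_G.zero_grad()
    loss_G.backward()
    opt_G.step()
```

Everything here is an ordinary alternating GAN loop; the time-scale separation between `opt_D` and `opt_G` is the ingredient the convergence proof relies on.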
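For intuition on the proof, the result rests on classical two time-scale stochastic approximation. Written with illustrative symbols ($a(n)$ for the generator's step size, $b(n)$ for the discriminator's; the paper states its precise assumptions in its theorems), conditions of the following flavor are required:

$$
\sum_n a(n) = \infty, \quad \sum_n a(n)^2 < \infty, \quad
\sum_n b(n) = \infty, \quad \sum_n b(n)^2 < \infty, \quad
\frac{a(n)}{b(n)} \to 0,
$$

so the discriminator evolves on the fast time scale and, from the slow generator's perspective, has effectively converged at every step. For Adam, the authors describe its averaged dynamics as those of a heavy ball with friction, whose generic ODE is

$$
\ddot{\theta}(t) + \gamma(t)\,\dot{\theta}(t) + \nabla f(\theta(t)) = 0,
$$

where $f$ is the objective and $\gamma(t)$ a friction coefficient.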
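The FID itself has a closed form: fitting Gaussians with means $\mu_r, \mu_g$ and covariances $\Sigma_r, \Sigma_g$ to the Inception features of real and generated images, the Fréchet distance between them is

$$
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
+ \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right),
$$

where lower values indicate that the generated feature distribution is closer to the real one.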
Numerical Insights and Bold Claims
The paper provides strong numerical evidence that TTUR consistently outperforms conventional single time-scale training. TTUR achieved lower Fréchet Inception Distances on every tested dataset, indicating that the distribution of generated images is closer to that of the real data. The results are also more stable: training runs under TTUR exhibit reduced variance and smoother convergence.
Implications and Future Directions
From a practical standpoint, TTUR offers a more reliable training method for GANs, which could significantly enhance applications in image generation and beyond. The ability to train more stable GANs could further lead to advancements in video synthesis, domain adaptation, and even reinforcement learning.
Theoretically, the insights into the role of distinct learning rates could spark similar explorations in other areas of deep learning, possibly influencing training strategies for complex models involving adversarial objectives. Moving forward, this approach might pave the way for adaptive time-scale learning rates that automatically adjust during training, potentially simplifying the tuning process and enhancing model robustness.
In conclusion, the introduction of TTUR and FID marks a significant step in the evolution of GAN training methodologies, presenting both a challenge to existing paradigms and new opportunities for research and application in generative modeling.