SAN: Inducing Metrizability of GAN with Discriminative Normalized Linear Layer (2301.12811v4)

Published 30 Jan 2023 in cs.LG

Abstract: Generative adversarial networks (GANs) learn a target probability distribution by optimizing a generator and a discriminator with minimax objectives. This paper addresses the question of whether such optimization actually provides the generator with gradients that make its distribution close to the target distribution. We derive metrizable conditions, sufficient conditions for the discriminator to serve as the distance between the distributions by connecting the GAN formulation with the concept of sliced optimal transport. Furthermore, by leveraging these theoretical results, we propose a novel GAN training scheme, called slicing adversarial network (SAN). With only simple modifications, a broad class of existing GANs can be converted to SANs. Experiments on synthetic and image datasets support our theoretical results and the SAN's effectiveness as compared to usual GANs. Furthermore, we also apply SAN to StyleGAN-XL, which leads to state-of-the-art FID score amongst GANs for class conditional generation on ImageNet 256$\times$256. Our implementation is available on https://ytakida.github.io/san.


Summary

  • The paper proposes a SAN framework integrating Functional Mean Divergence and sliced optimal transport to enforce metrizable conditions in GANs.
  • It establishes key properties like injectivity, separability, and direction optimality to improve discriminator reliability and guide effective generator updates.
  • It demonstrates, through experiments on CIFAR-10, CelebA, and ImageNet, improved FID scores and reduced mode collapse in image synthesis.

Overview of "SAN: Inducing Metrizability of GAN with Discriminative Normalized Linear Layer"

The paper introduces Slicing Adversarial Networks (SAN), a novel GAN training scheme aimed at the theoretical question of whether GAN optimization truly minimizes the dissimilarity between the generator and target distributions. Building on the standard GAN formulation, the authors introduce metrizable conditions, sufficient conditions under which the discriminator serves as a distance between the data and generator distributions. These conditions are direction optimality, separability, and injectivity.

Theoretical Contributions

The authors draw a connection between GAN optimization and sliced optimal transport, which underpins the derivation of the metrizable conditions. By analyzing the functional mean divergence (FM$^*$) and relating it to sliced optimal transport, they arrive at the slicing adversarial network (SAN) training scheme, which converts a broad class of existing GANs into SANs through only slight modifications.
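To make this connection concrete, write the discriminator as an inner product $d(x) = \langle \omega, h(x) \rangle$ between a feature map $h$ and a last linear direction $\omega$ on the unit sphere; this notation is illustrative and may differ from the paper's exact definitions. The IPM-style, max-sliced quantity underlying the argument is then

$$\max_{\omega \in \mathbb{S}^{D-1}} \Big( \mathbb{E}_{x \sim \mu_0}\big[\langle \omega, h(x) \rangle\big] - \mathbb{E}_{x \sim \mu_g}\big[\langle \omega, h(x) \rangle\big] \Big),$$

where $\mu_0$ and $\mu_g$ denote the data and generator distributions. Roughly, the metrizable conditions require that $h$ does not conflate distinct distributions (injectivity, separability) and that $\omega$ actually attains this maximum (direction optimality), so that the quantity behaves like a distance the generator can decrease.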

  1. Functional Mean Divergence (FM$^*$): The paper defines a generalized notion of divergence that is broad enough to encompass integral probability metrics (IPMs), including Wasserstein distances.
  2. Injectivity and Separability: These properties are crucial for ensuring that the discriminator can act as a valid metric between distributions. Injectivity prevents the feature map from discarding information by collapsing distinct inputs, while separability ensures that the discriminator's features actually distinguish the data distribution from the generator's distribution.
  3. Direction Optimality: This concerns whether the discriminator supplies effective gradients to the generator. Under direction optimality, the learned direction of the discriminator maximizes the dissimilarity between the two distributions, which is crucial in adversarial training; a hedged implementation sketch of this decomposition follows the list.
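As a concrete but unofficial illustration of the "simple modifications" mentioned above, the sketch below splits an existing discriminator into a feature backbone and a normalized last linear layer, so that its output takes the form $\langle \omega, h(x) \rangle$. The class and variable names are ours, not the paper's released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SanStyleDiscriminatorHead(nn.Module):
    """Last layer of a discriminator rewritten as <omega, h(x)>, with omega kept on
    the unit sphere. Illustrative sketch of the SAN idea, not the authors' code."""

    def __init__(self, feature_dim: int):
        super().__init__()
        # Unconstrained parameter; renormalized into a direction on every forward pass.
        self.weight = nn.Parameter(torch.randn(feature_dim))

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, feature_dim) features from the discriminator backbone.
        omega = F.normalize(self.weight, dim=0)
        return h @ omega  # one scalar score per sample


# Usage sketch: any backbone that outputs feature vectors can be reused as h.
backbone = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 64))
head = SanStyleDiscriminatorHead(feature_dim=64)
x = torch.randn(8, 128)
scores = head(backbone(x))  # shape: (8,)
```

Keeping $\omega$ explicitly normalized makes direction optimality something the training scheme can target directly, rather than a property one hopes an unconstrained last layer happens to satisfy.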

Empirical Evaluation

The authors conduct extensive experiments on synthetic data, CIFAR-10, CelebA, and ImageNet, demonstrating that SANs outperform traditional GAN frameworks in terms of Fréchet Inception Distance (FID) scores—a popular measure of image generation quality. Specifically:

  • SANs showed improved performance in avoiding mode collapse and better coverage of the data distribution on mixture-of-Gaussians data.
  • In terms of image quality, applying the SAN methodology led to significant improvements over the corresponding standard GANs, and combining SAN with StyleGAN-XL achieved a state-of-the-art FID score among GANs for class-conditional generation on ImageNet 256$\times$256.

Practical Implications and Future Directions

From a practical perspective, SAN allows for more stable and efficient GAN training by ensuring that the discriminator reliably reflects the discrepancy between the generator's distribution and the data distribution. This has applications in image, video, and audio generation, where GANs are widely used.

One of the strengths of the SAN framework is its compatibility with existing GAN architectures, allowing for straightforward implementation adjustments. As AI techniques evolve, it is likely that further insights and improvements will be made in the context of discriminator effectiveness, extending beyond generative modeling to other adversarial learning scenarios.
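To illustrate what such an adjustment could look like inside an existing training loop, here is a hedged sketch (our reading, not the paper's official recipe) in which the feature part of the discriminator keeps a standard hinge objective while the normalized direction is updated with a plain difference-of-means term; the detach calls keep the two updates separate. All function and variable names are illustrative.

```python
import torch.nn.functional as F


def san_style_discriminator_loss(h_real, h_fake, weight):
    """Hedged sketch of a SAN-style discriminator objective.

    h_real, h_fake: backbone features for real and generated batches, (batch, dim).
    weight: unnormalized last-layer parameter, shape (dim,).
    The split below (hinge loss for the features, difference-of-means for the
    direction) is an illustrative reading, not the paper's exact formulation.
    """
    omega = F.normalize(weight, dim=0)

    # Feature update: standard hinge GAN loss with the direction held fixed.
    out_real = h_real @ omega.detach()
    out_fake = h_fake @ omega.detach()
    loss_h = F.relu(1.0 - out_real).mean() + F.relu(1.0 + out_fake).mean()

    # Direction update: widen the mean gap along omega with the features held fixed.
    gap = (h_real.detach() @ omega).mean() - (h_fake.detach() @ omega).mean()
    loss_omega = -gap

    return loss_h + loss_omega
```

In this sketch the generator loss is left unchanged from the base GAN, which is what would make the conversion a drop-in change for architectures such as StyleGAN-XL.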

Future work could examine the separability and injectivity conditions more thoroughly in practice, explore other domains where adversarial strategies could benefit from this analytical perspective, and investigate the SAN framework's adaptability to other cutting-edge neural architectures.
