SAN: Inducing Metrizability of GAN with Discriminative Normalized Linear Layer (2301.12811v4)
Abstract: Generative adversarial networks (GANs) learn a target probability distribution by optimizing a generator and a discriminator with minimax objectives. This paper addresses the question of whether such optimization actually provides the generator with gradients that make its distribution close to the target distribution. We derive metrizable conditions, i.e., sufficient conditions for the discriminator to serve as a distance between the distributions, by connecting the GAN formulation with the concept of sliced optimal transport. Leveraging these theoretical results, we propose a novel GAN training scheme called the slicing adversarial network (SAN). With only simple modifications, a broad class of existing GANs can be converted to SANs. Experiments on synthetic and image datasets support our theoretical results and demonstrate SAN's effectiveness compared with standard GANs. We also apply SAN to StyleGAN-XL, achieving a state-of-the-art FID score among GANs for class-conditional generation on ImageNet 256$\times$256. Our implementation is available at https://ytakida.github.io/san.
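The "simple modification" behind SAN is that the discriminator's last linear layer is normalized so that its weight vector acts as a unit-norm slicing direction applied to the extracted features, connecting the critic to sliced optimal transport. The following NumPy sketch illustrates only this structural idea; the function and variable names are our own and the authors' actual implementation (including the two-term training objective) differs:

```python
import numpy as np

rng = np.random.default_rng(0)

def san_style_discriminator(x, W_feat, omega):
    """Sketch of a last-layer-normalized discriminator:
    features h(x) are projected onto a unit-norm direction (the 'slice')."""
    h = np.maximum(x @ W_feat, 0.0)            # toy feature extractor (one ReLU layer)
    omega_bar = omega / np.linalg.norm(omega)  # normalized last linear layer
    return h @ omega_bar                       # scalar critic value per sample

# toy shapes: 2-D inputs, 8-D features
W_feat = rng.normal(size=(2, 8))
omega = rng.normal(size=8)

x = rng.normal(size=(5, 2))
scores = san_style_discriminator(x, W_feat, omega)
print(scores.shape)  # one scalar score per input sample
```

In the paper's scheme, the direction and the feature extractor are additionally trained with separate objective terms (via stop-gradients), which is what induces the metrizability the abstract refers to; that part is omitted here.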