A Wasserstein perspective of Vanilla GANs (2403.15312v2)

Published 22 Mar 2024 in math.ST, cs.LG, stat.ML, and stat.TH

Abstract: The empirical success of Generative Adversarial Networks (GANs) has led to increasing interest in theoretical research. The statistical literature is mainly focused on Wasserstein GANs and generalizations thereof, which especially allow for good dimension reduction properties. Statistical results for Vanilla GANs, the original optimization problem, are still rather limited and require assumptions such as smooth activation functions and equal dimensions of the latent space and the ambient space. To bridge this gap, we draw a connection from Vanilla GANs to the Wasserstein distance. By doing so, existing results for Wasserstein GANs can be extended to Vanilla GANs. In particular, we obtain an oracle inequality for Vanilla GANs in Wasserstein distance. The assumptions of this oracle inequality are designed to be satisfied by network architectures commonly used in practice, such as feedforward ReLU networks. By providing a quantitative result for the approximation of a Lipschitz function by a feedforward ReLU network with bounded Hölder norm, we conclude a rate of convergence for Vanilla GANs as well as Wasserstein GANs as estimators of the unknown probability distribution.
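For readers less familiar with the two objectives being connected, they can be written in their standard textbook form as follows (a generic sketch of the usual definitions; the notation is not taken from the paper itself). The Vanilla GAN of Goodfellow et al. (2014) is a minimax game between a generator G and a discriminator D, with data distribution P and latent distribution P_Z:

\[ \min_{G}\max_{D}\; \mathbb{E}_{X\sim P}\bigl[\log D(X)\bigr] + \mathbb{E}_{Z\sim P_Z}\bigl[\log\bigl(1 - D(G(Z))\bigr)\bigr] \]

The Wasserstein-1 distance, the metric in which the paper's oracle inequality and convergence rates are stated, admits the Kantorovich-Rubinstein dual form as a supremum over 1-Lipschitz critics:

\[ W_1(P, Q) \;=\; \sup_{\lVert f\rVert_{\mathrm{Lip}} \le 1} \mathbb{E}_{X\sim P}\bigl[f(X)\bigr] - \mathbb{E}_{Y\sim Q}\bigl[f(Y)\bigr] \]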

Authors (2)
  1. Lea Kunkel (2 papers)
  2. Mathias Trabs (30 papers)
Citations (1)
