Efficient Generative Modeling via Penalized Optimal Transport Network (2402.10456v2)

Published 16 Feb 2024 in stat.ML, cs.LG, stat.AP, and stat.ME

Abstract: The generation of synthetic data whose distribution faithfully emulates the underlying data-generating mechanism is of paramount importance. Wasserstein Generative Adversarial Networks (WGANs) have emerged as a prominent tool for this task; however, due to the delicate equilibrium of the minimax formulation and the instability of the Wasserstein distance in high dimensions, WGANs often manifest the pathological phenomenon of mode collapse. This results in generated samples that converge to a restricted set of outputs and fail to adequately capture the tail behaviors of the true distribution. Such limitations can lead to serious downstream consequences. To address these limitations, we propose the Penalized Optimal Transport Network (POTNet), a versatile deep generative model based on the marginally-penalized Wasserstein (MPW) distance. Through the MPW distance, POTNet effectively leverages low-dimensional marginal information to guide the overall alignment of joint distributions. Furthermore, our primal-based framework enables direct evaluation of the MPW distance, thus eliminating the need for a critic network. This formulation circumvents the training instabilities inherent in adversarial approaches and avoids the need for extensive parameter tuning. We derive a non-asymptotic bound on the generalization error of the MPW loss and establish convergence rates for the generative distribution learned by POTNet. Our theoretical analysis, together with extensive empirical evaluations, demonstrates the superior performance of POTNet in accurately capturing underlying data structures, including their tail behaviors and minor modalities. Moreover, our model achieves orders-of-magnitude speedup during the sampling stage compared to state-of-the-art alternatives, enabling computationally efficient large-scale synthetic data generation.
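To make the MPW idea concrete: the paper's exact definition of the marginally-penalized Wasserstein distance is not reprinted on this page, so the sketch below assumes the natural form of a joint Wasserstein term plus a weighted sum of one-dimensional Wasserstein distances between coordinate marginals. The penalty weight `lam`, the squared-Euclidean ground cost, and the uniform empirical weights are illustrative assumptions, not the paper's specification. The joint term is evaluated directly in the primal with the POT library, which is what makes a critic network unnecessary.

```python
import numpy as np
import ot  # POT: Python Optimal Transport
from scipy.stats import wasserstein_distance


def mpw_distance(x, y, lam=1.0):
    """Evaluate an MPW-style loss between two empirical samples.

    x, y : (n, d) and (m, d) arrays of samples from the two distributions.
    lam  : hypothetical penalty weight on the marginal terms (an assumption,
           not the paper's parameterization).
    """
    n, d = x.shape
    m = y.shape[0]

    # Joint term: exact optimal-transport cost between the two empirical
    # measures (uniform weights), solved in the primal as a linear program.
    # The value is computed directly, so no critic network is involved.
    cost = ot.dist(x, y)                      # pairwise squared-Euclidean costs
    a = np.full(n, 1.0 / n)
    b = np.full(m, 1.0 / m)
    joint = ot.emd2(a, b, cost)

    # Marginal penalty: one-dimensional Wasserstein distances are cheap to
    # compute (they reduce to sorting), which is how low-dimensional marginal
    # information can guide the alignment of the joint distributions.
    marginals = sum(wasserstein_distance(x[:, j], y[:, j]) for j in range(d))

    return joint + lam * marginals


rng = np.random.default_rng(0)
real = rng.normal(size=(256, 4))
fake = rng.normal(loc=0.3, size=(256, 4))
print(mpw_distance(real, fake, lam=0.5))      # single scalar loss value
```

Evaluating the loss this way on minibatches is what removes the adversarial min-max: a generator can in principle be trained by minimizing this quantity directly, though backpropagation would require a differentiable OT solver rather than the plain NumPy evaluation sketched here.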
