Deep Generative Sampling in the Dual Divergence Space: A Data-efficient & Interpretative Approach for Generative AI (2404.07377v1)

Published 10 Apr 2024 in cs.LG, cs.AI, cs.CL, cs.CV, cs.IT, and math.IT

Abstract: Building on the remarkable achievements in generative sampling of natural images, we propose an innovative challenge, potentially overly ambitious, which involves generating samples of entire multivariate time series that resemble images. However, the statistical challenge lies in the small sample size, sometimes consisting of a few hundred subjects. This issue is especially problematic for deep generative models that follow the conventional approach of generating samples from a canonical distribution and then decoding or denoising them to match the true data distribution. In contrast, our method is grounded in information theory and aims to implicitly characterize the distribution of images, particularly the (global and local) dependency structure between pixels. We achieve this by empirically estimating its KL-divergence in the dual form with respect to the respective marginal distribution. This enables us to perform generative sampling directly in the optimized 1-D dual divergence space. Specifically, in the dual space, training samples representing the data distribution are embedded in the form of various clusters between two end points. In theory, any sample embedded between those two end points is in-distribution w.r.t. the data distribution. Our key idea for generating novel samples of images is to interpolate between the clusters via a walk as per gradients of the dual function w.r.t. the data dimensions. In addition to the data efficiency gained from direct sampling, we propose an algorithm that offers a significant reduction in sample complexity for estimating the divergence of the data distribution with respect to the marginal distribution. We provide strong theoretical guarantees along with an extensive empirical evaluation using many real-world datasets from diverse domains, establishing the superiority of our approach w.r.t. state-of-the-art deep learning methods.
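
By "dual form" the abstract presumably means the Donsker-Varadhan variational representation of the KL divergence, the standard dual formulation used by MINE-style neural estimators, in which a learned scalar function T maps each sample into a 1-D dual space:

```latex
D_{\mathrm{KL}}(P \,\|\, Q)
  \;=\; \sup_{T:\,\mathcal{X}\to\mathbb{R}}
  \Big( \mathbb{E}_{x \sim P}\big[T(x)\big]
        \;-\; \log \mathbb{E}_{x \sim Q}\big[e^{T(x)}\big] \Big)
```

The sketch below is an illustration under assumptions, not the authors' released method: the critic architecture, the dimension-wise shuffle used as a stand-in for the marginal distribution, and all hyperparameters are placeholders. It fits such a critic with the Donsker-Varadhan objective and then performs the kind of gradient-based walk the abstract describes, nudging a seed sample along the gradient of T with respect to the data dimensions until its 1-D embedding reaches a target value lying between training clusters.

```python
# Illustrative sketch only: a Donsker-Varadhan (MINE-style) critic giving the 1-D
# dual embedding T(x), plus a gradient walk that moves a seed sample's embedding
# toward a target value between clusters. Names and hyperparameters are assumptions.
import torch
import torch.nn as nn


class DualCritic(nn.Module):
    """Scalar dual function T : R^d -> R (the 1-D dual divergence space)."""

    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(-1)


def dv_loss(T: DualCritic, joint: torch.Tensor, marginal: torch.Tensor) -> torch.Tensor:
    """Negative Donsker-Varadhan lower bound: -(E_P[T] - log E_Q[exp T])."""
    log_mean_exp = torch.logsumexp(T(marginal), dim=0) - torch.log(
        torch.tensor(marginal.shape[0], dtype=torch.float32)
    )
    return -(T(joint).mean() - log_mean_exp)


def shuffle_dims(x: torch.Tensor) -> torch.Tensor:
    """Crude stand-in for the marginal: permute each dimension independently."""
    return torch.stack([col[torch.randperm(x.shape[0])] for col in x.T], dim=1)


def gradient_walk(T: DualCritic, x0: torch.Tensor, target: float,
                  step: float = 1e-2, iters: int = 200) -> torch.Tensor:
    """Nudge x along dT/dx until its dual embedding T(x) approaches `target`,
    a value chosen between two training clusters in the 1-D dual space."""
    x = x0.clone().requires_grad_(True)
    for _ in range(iters):
        gap = (T(x) - target).pow(2).sum()
        (grad,) = torch.autograd.grad(gap, x)
        x = (x - step * grad).detach().requires_grad_(True)
    return x.detach()


if __name__ == "__main__":
    data = torch.randn(256, 32)               # stand-in for flattened training samples
    critic = DualCritic(dim=32)
    opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
    for _ in range(500):                       # fit the dual function
        opt.zero_grad()
        dv_loss(critic, data, shuffle_dims(data)).backward()
        opt.step()
    embeddings = critic(data).detach()         # 1-D dual embeddings of the data
    target = embeddings.quantile(0.5).item()   # a point between the cluster end points
    novel = gradient_walk(critic, data[:1], target)
```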

