Latent Schrödinger Bridge Diffusion Model for Generative Learning (2404.13309v2)

Published 20 Apr 2024 in stat.ML and cs.LG

Abstract: This paper conducts a comprehensive theoretical analysis of current diffusion models. We introduce a novel generative learning methodology that uses the Schrödinger bridge diffusion model in latent space as the framework for theoretical exploration. Our approach begins by pre-training an encoder-decoder architecture on data whose distribution may diverge from the target distribution, which accommodates large sample sizes by leveraging pre-existing large-scale models. We then develop a diffusion model within the latent space using the Schrödinger bridge framework. Our theoretical analysis establishes end-to-end error bounds for learning distributions via the latent Schrödinger bridge diffusion model; specifically, we control the second-order Wasserstein distance between the generated distribution and the target distribution. The resulting convergence rates mitigate the curse of dimensionality, offering robust theoretical support for prevailing diffusion models.
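
For orientation, the display below sketches the standard static Schrödinger bridge problem that underlies this construction; the notation is ours (mu for the encoded target distribution, nu for the prior, pi_ref for the reference process) and is the textbook formulation, not reproduced from the paper:

\[
\pi^\star \;=\; \operatorname*{arg\,min}_{\pi \in \Pi(\mu,\,\nu)} \mathrm{KL}\!\left(\pi \,\big\|\, \pi^{\mathrm{ref}}\right),
\]

i.e., among all processes in \(\Pi(\mu,\nu)\) with the prescribed marginals, the bridge is the one closest in KL divergence to a reference diffusion such as Brownian motion. The paper's guarantee then bounds the second-order Wasserstein distance \(W_2(\hat{p},\, p_{\mathrm{target}})\) between the generated and target distributions, with rates that, per the abstract, mitigate the curse of dimensionality.

As a rough illustration of the two-stage pipeline the abstract describes, the minimal sketch below pre-trains an encoder-decoder on auxiliary data and then fits a drift network in the frozen latent space. All architectures, dimensions, and the Euler-Maruyama sampler are our own illustrative assumptions; the bridge-specific training objective is omitted because it depends on the paper's construction.

    # Minimal sketch (our construction, not the paper's code) of the
    # two-stage pipeline: (1) pre-train an encoder-decoder on auxiliary
    # data, (2) fit a Schrodinger-bridge-style diffusion in latent space.
    import torch
    import torch.nn as nn

    DATA_DIM, LATENT_DIM = 64, 8  # illustrative dimensions

    # Stage 1: encoder-decoder, pre-trained on data that may come from a
    # distribution different from the target (reconstruction loss only).
    encoder = nn.Sequential(nn.Linear(DATA_DIM, 128), nn.ReLU(),
                            nn.Linear(128, LATENT_DIM))
    decoder = nn.Sequential(nn.Linear(LATENT_DIM, 128), nn.ReLU(),
                            nn.Linear(128, DATA_DIM))

    def pretrain_autoencoder(aux_loader, epochs=5):
        opt = torch.optim.Adam([*encoder.parameters(),
                                *decoder.parameters()], lr=1e-3)
        for _ in range(epochs):
            for x in aux_loader:
                loss = ((decoder(encoder(x)) - x) ** 2).mean()
                opt.zero_grad(); loss.backward(); opt.step()

    # Stage 2: a drift network b(z, t) for the latent bridge, trained on
    # target data pushed through the frozen encoder (training loss omitted).
    drift = nn.Sequential(nn.Linear(LATENT_DIM + 1, 128), nn.ReLU(),
                          nn.Linear(128, LATENT_DIM))

    @torch.no_grad()
    def sample(n, steps=100, sigma=1.0):
        # Euler-Maruyama integration of dZ = b(Z, t) dt + sigma dW from a
        # Gaussian prior over [0, 1], then decode back to data space.
        z = torch.randn(n, LATENT_DIM)
        dt = 1.0 / steps
        for k in range(steps):
            t = torch.full((n, 1), k * dt)
            z = z + drift(torch.cat([z, t], dim=1)) * dt \
                  + sigma * (dt ** 0.5) * torch.randn_like(z)
        return decoder(z)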
