Variational Schrödinger Diffusion Models (2405.04795v4)

Published 8 May 2024 in cs.LG

Abstract: Schrödinger bridge (SB) has emerged as the go-to method for optimizing transportation plans in diffusion models. However, SB requires estimating the intractable forward score functions, inevitably resulting in the costly implicit training loss based on simulated trajectories. To improve the scalability while preserving efficient transportation plans, we leverage variational inference to linearize the forward score functions (variational scores) of SB and restore simulation-free properties in training backward scores. We propose the variational Schrödinger diffusion model (VSDM), where the forward process is a multivariate diffusion and the variational scores are adaptively optimized for efficient transport. Theoretically, we use stochastic approximation to prove the convergence of the variational scores and show the convergence of the adaptively generated samples based on the optimal variational scores. Empirically, we test the algorithm in simulated examples and observe that VSDM is efficient in generations of anisotropic shapes and yields straighter sample trajectories compared to the single-variate diffusion. We also verify the scalability of the algorithm in real-world data and achieve competitive unconditional generation performance in CIFAR10 and conditional generation in time series modeling. Notably, VSDM no longer depends on warm-up initializations and has become tuning-friendly in training large-scale experiments.


Summary

  • The paper presents a variational approach that linearizes the intractable forward score functions of the Schrödinger bridge, eliminating the need for simulated trajectories when training the backward scores.
  • The paper validates its method with theoretical convergence guarantees using stochastic approximation and demonstrates competitive performance on datasets like CIFAR10.
  • The paper shows practical scalability and efficient sample generation, paving the way for real-time applications in image and audio synthesis.

Understanding Variational Schrödinger Diffusion Models

Introduction

Advances in diffusion models have significantly impacted domains such as image and audio synthesis. Traditional diffusion models, however, lack optimal-transport guarantees, which can make their sample trajectories inefficient for certain applications. Approaches based on the Schrödinger bridge problem improve these models by optimizing the transportation plan, but at the cost of increased computational overhead: the forward score functions are intractable and must be estimated from simulated trajectories.

Addressing these concerns, the paper proposes the Variational Schrödinger Diffusion Model (VSDM), which tames this complexity through variational inference: the intractable forward score functions are linearized (the variational scores), so the backward scores can be trained without simulating trajectories.
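
To make the mechanism concrete, here is a minimal sketch (not the authors' code) of why a linear forward drift restores simulation-free training: with a linear drift, the forward transition kernel is Gaussian in closed form, so a noisy sample x_t can be drawn directly from x_0 and the backward score trained by denoising score matching. The diagonal coefficients `a` stand in for the linearized (variational) forward scores; all names are illustrative.

```python
import numpy as np

def forward_marginal(x0, a, t, rng):
    """Anisotropic VP-style forward SDE: dX_i = -0.5*a_i*X_i dt + sqrt(a_i) dW_i.
    Closed-form marginal: X_t ~ N(exp(-0.5*a*t) * x0, 1 - exp(-a*t))."""
    mean = np.exp(-0.5 * a * t) * x0
    std = np.sqrt(1.0 - np.exp(-a * t))
    eps = rng.standard_normal(x0.shape)
    return mean + std * eps, eps, std

def dsm_loss(score_net, x0_batch, a, rng):
    """Denoising score matching: regress the network onto -eps/std,
    the score of the closed-form Gaussian transition kernel."""
    t = rng.uniform(1e-3, 1.0)
    xt, eps, std = forward_marginal(x0_batch, a, t, rng)
    target = -eps / std
    return np.mean((score_net(xt, t) - target) ** 2)
```

For instance, with `a = np.array([5.0, 0.2])` the first coordinate mixes much faster than the second, mimicking the anisotropic multivariate forward diffusions the paper studies; no forward trajectory is ever simulated.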

Key Contributions of Variational Schrödinger Diffusion Models

The shift to a variational framework brings several notable enhancements and findings:

  • Efficiency in Training: By approximating the intractable forward score functions with linear (variational) forms, VSDM restores simulation-free training of the backward scores, substantially reducing cost.
  • Theoretical Robustness: Convergence of the variational scores is established via stochastic approximation, and the samples generated with the optimal variational scores are shown to converge under suitable conditions (a sketch of this alternating update follows this list).
  • Practical Scalability: The authors test VSDM on anisotropic synthetic shapes and on real-world data, achieving competitive unconditional generation on CIFAR10 and conditional generation in time series modeling without warm-up initializations or extensive tuning.
  • Straighter Sample Trajectories: VSDM yields straighter, more efficient sample trajectories than single-variate diffusion, which improves generation quality on anisotropic data distributions.
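
Below is a minimal sketch of the alternating scheme suggested by that stochastic-approximation analysis, under stated assumptions: a diagonal linear coefficient `a`, a hypothetical noisy gradient estimator `transport_cost_grad` for the transport objective, and Robbins-Monro step sizes. It illustrates the update structure only and is not the authors' implementation.

```python
import numpy as np

def robbins_monro_step(a, grad_estimate, k, eta0=0.1):
    """One stochastic-approximation update of the variational coefficients.
    Step sizes eta_k = eta0/(k+1) satisfy sum(eta) = inf, sum(eta^2) < inf."""
    eta_k = eta0 / (k + 1)
    return a - eta_k * np.asarray(grad_estimate)

def train(a_init, backward_score_step, transport_cost_grad, n_iters, rng):
    """Alternate simulation-free backward-score training with adaptive
    updates of the (linear) variational forward scores."""
    a = a_init
    for k in range(n_iters):
        backward_score_step(a, rng)       # inner loop: denoising score matching
        g = transport_cost_grad(a, rng)   # noisy gradient of the transport objective
        a = robbins_monro_step(a, g, k)   # outer loop: adapt variational scores
    return a
```

Decoupling the two loops is the key design choice: the backward scores are always trained against a fixed linear forward process (hence simulation-free), while the outer stochastic-approximation updates steer that process toward a more efficient transport plan.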

Theoretical Implications

From a theoretical standpoint, VSDM strikes a balance by approximating the forward score of the Schrödinger bridge with a linear variational surrogate, replacing the traditional but computationally expensive simulation-based estimates. This trade-off between computational feasibility and exactness could pave the way for new research, especially into how variational methods can be applied to other complex models in machine learning.

Practical Implications

Practically, VSDM's ability to train without simulating forward trajectories makes it a strong candidate for real-time applications or scenarios where computational resources are limited. Its performance on standard benchmarks like CIFAR10 and on conditional time-series generation illustrates its capability to handle complex, high-dimensional data efficiently, which is promising for applications in graphics generation, advanced simulations, and more.

Future Directions

The introduction of VSDM is a significant step, but the journey doesn't end here. The authors speculate that future developments might explore more dynamic approximations or even extend these techniques to other forms of differential equations used in modeling and simulation. There's also potential in exploring how different forms of variational inference can further optimize the trade-off between computational overhead and transport efficiency in diffusion models.

Conclusion

With VSDM, we witness a meaningful evolution in diffusion models, pushing the boundaries of efficiency and scalability while maintaining robust theoretical foundations. Its ability to generate quality data with reduced computational demands opens new avenues for both academic exploration and practical application in the field of AI and machine learning.