Variational Schrödinger Diffusion Models (2405.04795v4)
Abstract: The Schrödinger bridge (SB) has emerged as the go-to method for optimizing transportation plans in diffusion models. However, SB requires estimating intractable forward score functions, which inevitably results in a costly implicit training loss based on simulated trajectories. To improve scalability while preserving efficient transportation plans, we leverage variational inference to linearize the forward score functions (variational scores) of SB and restore simulation-free properties in training backward scores. We propose the variational Schrödinger diffusion model (VSDM), where the forward process is a multivariate diffusion and the variational scores are adaptively optimized for efficient transport. Theoretically, we use stochastic approximation to prove the convergence of the variational scores and show the convergence of the adaptively generated samples based on the optimal variational scores. Empirically, we test the algorithm on simulated examples and observe that VSDM is efficient at generating anisotropic shapes and yields straighter sample trajectories than the single-variate diffusion. We also verify the scalability of the algorithm on real-world data and achieve competitive unconditional generation performance on CIFAR10 and conditional generation in time series modeling. Notably, VSDM no longer depends on warm-up initializations and is tuning-friendly when training large-scale experiments.
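To illustrate why a linearized forward score can restore simulation-free training, the sketch below considers a linear multivariate forward SDE of the assumed form dx = -0.5 β (I - 2A) x dt + sqrt(β) dW. This specific form, the constant β, the constant symmetric matrix A, and the function names `forward_marginal` and `sample_forward` are all illustrative assumptions, not details taken from the abstract. Under these assumptions x_t given x_0 is Gaussian with a closed-form mean and covariance, so noisy samples at any time t can be drawn directly, without simulating trajectories.

```python
import numpy as np


def forward_marginal(A, beta, t):
    """Closed-form Gaussian marginal of x_t | x_0 for the assumed linear SDE
    dx = -0.5 * beta * (I - 2A) x dt + sqrt(beta) dW.

    Assumptions of this sketch (hypothetical, for illustration only):
    beta is constant, A is constant and symmetric, and D = I - 2A is
    positive definite. Returns (mean_factor, cov, cov_sqrt) such that
    x_t | x_0 ~ N(mean_factor @ x_0, cov) and cov = cov_sqrt @ cov_sqrt.T.
    """
    d = A.shape[0]
    D = np.eye(d) - 2.0 * A
    evals, Q = np.linalg.eigh(D)  # D = Q diag(evals) Q^T
    if np.any(evals <= 0):
        raise ValueError("I - 2A must be positive definite in this sketch")
    # Mean factor: matrix exponential exp(-0.5 * beta * t * D).
    mean_factor = (Q * np.exp(-0.5 * beta * t * evals)) @ Q.T
    # Covariance: D^{-1} (I - exp(-beta * t * D)), computed per eigenvalue.
    var = (1.0 - np.exp(-beta * t * evals)) / evals
    cov = (Q * var) @ Q.T
    cov_sqrt = Q * np.sqrt(var)
    return mean_factor, cov, cov_sqrt


def sample_forward(x0, A, beta, t, rng):
    """Draw x_t directly from x0 (shape (n, d)) without simulating a path."""
    mean_factor, _, cov_sqrt = forward_marginal(A, beta, t)
    eps = rng.standard_normal(x0.shape)
    return x0 @ mean_factor.T + eps @ cov_sqrt.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = np.array([[0.2, 0.1], [0.1, -0.3]])  # hypothetical variational score matrix
    x0 = rng.standard_normal((4, 2))
    print(sample_forward(x0, A, beta=1.0, t=0.5, rng=rng))
```

Setting A to the zero matrix recovers an isotropic, variance-preserving style forward process in this sketch; a nonzero A makes the noising rate direction-dependent, which is the kind of anisotropy the abstract refers to.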