Contractive Diffusion Probabilistic Models (2401.13115v3)
Abstract: Diffusion probabilistic models (DPMs) have emerged as a promising technique in generative modeling. The success of DPMs relies on two ingredients: time reversal of diffusion processes and score matching. Since score matching accuracy cannot be guaranteed in general, we propose a new design criterion -- the contraction property of backward sampling -- leading to a novel class of contractive DPMs (CDPMs). Our key insight is that the contraction property provably narrows both score matching errors and discretization errors, making CDPMs robust to both sources of error. For practical use, we show that a CDPM can leverage the weights of a pretrained DPM via a simple transformation, with no retraining required. We corroborate our approach with experiments on synthetic one-dimensional examples, Swiss Roll, MNIST, CIFAR-10 32$\times$32, and AFHQ 64$\times$64 datasets. Notably, CDPMs consistently improve the performance of baseline score-based diffusion models.
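For readers unfamiliar with the mechanics the abstract compresses, the following is a minimal sketch of the backward sampling step whose contraction property the paper studies: reverse-time Euler--Maruyama sampling for a standard variance-preserving SDE with a plug-in score estimate. This is an illustration under stated assumptions, not the authors' implementation; `reverse_sampler`, `score_fn`, and the $\beta$ schedule are hypothetical names, and the paper's actual contractive transformation of pretrained weights is not reproduced here.

```python
# Sketch (not the paper's exact algorithm): reverse-time Euler-Maruyama
# sampling for the VP-SDE  dX = -(1/2) beta(t) X dt + sqrt(beta(t)) dW,
# whose reverse drift uses the (learned) score  s(x, t) = grad log p_t(x).
import numpy as np

def reverse_sampler(score_fn, dim, n_steps=1000, T=1.0,
                    beta=lambda t: 0.1 + 19.9 * t, seed=0):
    """Draw one sample by discretizing the reverse-time VP-SDE.

    Update (algorithmic time running from t = T down to 0):
        y <- y + [ (1/2) beta(t) y + beta(t) score(y, t) ] dt + sqrt(beta(t) dt) z.
    A contractive DPM would modify these backward dynamics so that the
    update map is a contraction (see the paper for the exact transformation).
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    y = rng.standard_normal(dim)  # start from the (approximate) prior N(0, I)
    for k in range(n_steps, 0, -1):
        t = k * dt
        b = beta(t)
        drift = 0.5 * b * y + b * score_fn(y, t)  # reverse drift via the score
        y = y + drift * dt + np.sqrt(b * dt) * rng.standard_normal(dim)
    return y

# Toy check with an exact score: a target N(mu, I) keeps unit covariance under
# the VP forward process, so score(x, t) = -(x - m(t) * mu) with
# m(t) = exp(-0.5 * int_0^t beta(s) ds) = exp(-0.5 * (0.1 t + 9.95 t^2)).
mu = np.array([3.0, -2.0])
m = lambda t: np.exp(-0.5 * (0.1 * t + 9.95 * t * t))
exact_score = lambda x, t: -(x - m(t) * mu)
print(reverse_sampler(exact_score, dim=2))  # approximately a draw from N(mu, I)
```

In practice `score_fn` would be a pretrained score network rather than a closed-form expression; the paper's point is that the backward sampling above inherits both score matching error and discretization error, and that enforcing contraction keeps both from amplifying along the trajectory.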