Accelerating Diffusion Sampling with Optimized Time Steps (2402.17376v3)

Published 27 Feb 2024 in cs.CV, cs.AI, and cs.LG

Abstract: Diffusion probabilistic models (DPMs) have shown remarkable performance in high-resolution image synthesis, but their sampling efficiency still leaves much to be desired due to the typically large number of sampling steps. Recent advancements in high-order numerical ODE solvers for DPMs have enabled the generation of high-quality images with far fewer sampling steps. While this is a significant development, most sampling methods still employ uniform time steps, which is not optimal when only a small number of steps is used. To address this issue, we propose a general framework for designing an optimization problem that seeks more appropriate time steps for a specific numerical ODE solver for DPMs. This optimization problem aims to minimize the distance between the ground-truth solution to the ODE and an approximate solution corresponding to the numerical solver. It can be solved efficiently using the constrained trust region method, taking less than $15$ seconds. Our extensive experiments on both unconditional and conditional sampling using pixel- and latent-space DPMs demonstrate that, when combined with the state-of-the-art sampling method UniPC, our optimized time steps significantly improve image generation performance in terms of FID scores on datasets such as CIFAR-10 and ImageNet, compared to using uniform time steps.
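For readers who want a concrete picture of the optimization step, the sketch below shows the general shape of such a problem in Python: the interior sampling times are the decision variables, linear constraints keep them monotone, and a constrained trust-region method (SciPy's `trust-constr`) minimizes an error objective. The log-SNR schedule and the surrogate objective here are hypothetical stand-ins for illustration; the paper's actual objective is a solver-specific measure of the distance between the exact and approximate ODE solutions.

```python
# Minimal sketch: optimize non-uniform diffusion sampling times with a
# constrained trust-region method.  The schedule and objective below are
# illustrative assumptions, not the paper's actual error bound.
import numpy as np
from scipy.optimize import minimize, LinearConstraint, Bounds

T, eps, N = 1.0, 1e-3, 10        # integration interval [eps, T] and number of steps

def log_snr(t):
    # Hypothetical VP-SDE-style log-SNR schedule (linear beta).
    beta0, beta1 = 0.1, 20.0
    log_alpha = -0.25 * t ** 2 * (beta1 - beta0) - 0.5 * t * beta0
    return 2.0 * log_alpha - np.log1p(-np.exp(2.0 * log_alpha))

def surrogate_error(interior):
    # Proxy objective: penalize uneven and overly large log-SNR increments.
    t = np.concatenate(([T], interior, [eps]))
    h = np.diff(log_snr(t))                      # per-step log-SNR increments
    return np.sum(np.diff(h) ** 2) + 1e-3 * np.sum(h ** 2)

n = N - 1                                        # number of free interior times
x0 = np.linspace(T, eps, N + 1)[1:-1]            # uniform initialization

# Ordering constraint t_i - t_{i+1} >= 1e-4 keeps the grid strictly decreasing.
A = np.zeros((n - 1, n))
for i in range(n - 1):
    A[i, i], A[i, i + 1] = 1.0, -1.0
order = LinearConstraint(A, 1e-4, np.inf)
box = Bounds(eps + 1e-4, T - 1e-4)               # interior times stay inside (eps, T)

res = minimize(surrogate_error, x0, method="trust-constr",
               constraints=[order], bounds=box, options={"maxiter": 500})
t_opt = np.concatenate(([T], res.x, [eps]))      # optimized, monotone time grid
print(np.round(t_opt, 4))
```

In the paper itself, the surrogate above would be replaced by the solver-specific discrepancy between the ground-truth ODE solution and the numerical solver's approximation; the constrained, low-dimensional structure of the problem is what makes it solvable in seconds.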
