Multistep Consistency Models (2403.06807v3)
Abstract: Diffusion models are relatively easy to train but require many steps to generate samples. Consistency models are far more difficult to train, but generate samples in a single step. In this paper we propose Multistep Consistency Models: a unification of Consistency Models (Song et al., 2023) and TRACT (Berthelot et al., 2023) that interpolates between a consistency model and a diffusion model, trading off sampling speed against sampling quality. Specifically, a 1-step consistency model is a conventional consistency model, whereas an $\infty$-step consistency model is a diffusion model. Multistep Consistency Models work well in practice. By increasing the sample budget from a single step to 2-8 steps, we can more easily train models that generate higher-quality samples, while retaining much of the sampling-speed benefit. Notable results are 1.4 FID on ImageNet 64 in 8 steps and 2.1 FID on ImageNet 128 in 8 steps with consistency distillation, using simple losses without adversarial training. We also show that our method scales to a text-to-image diffusion model, generating samples that are close in quality to those of the original model.
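The abstract's sampling budget of 2-8 steps can be made concrete with a small sketch. The code below illustrates one plausible reading of multistep consistency sampling under stated assumptions: the time interval is split into a few segments, the consistency model jumps from the current noisy sample to a clean-data estimate, and a DDIM-style step re-noises that estimate to the next segment boundary. The `model(z, t)` interface, the linear alpha/sigma schedule, and the function name are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np


def multistep_consistency_sample(model, num_steps, shape, seed=0):
    """Illustrative sketch of multistep consistency sampling (not the paper's
    exact algorithm). Assumes `model(z, t)` is a trained consistency model
    mapping a noisy sample z at time t to a clean-data estimate; the linear
    schedule alpha_t = 1 - t, sigma_t = t is an assumption for illustration."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(shape)               # pure noise at t = 1
    # Split [1, 0] into `num_steps` segments: one step recovers a standard
    # consistency model, while many steps approach ordinary diffusion sampling.
    times = np.linspace(1.0, 0.0, num_steps + 1)
    x_hat = z
    for t_cur, t_next in zip(times[:-1], times[1:]):
        alpha_cur, sigma_cur = 1.0 - t_cur, t_cur
        x_hat = model(z, t_cur)                  # jump straight to a data estimate
        if t_next == 0.0:
            break                                # final segment: keep the estimate
        # DDIM-style deterministic re-noising to the next segment boundary,
        # reusing the noise implied by the current sample and its estimate.
        eps_hat = (z - alpha_cur * x_hat) / sigma_cur
        alpha_next, sigma_next = 1.0 - t_next, t_next
        z = alpha_next * x_hat + sigma_next * eps_hat
    return x_hat
```

With `num_steps=1` this reduces to ordinary one-step consistency sampling; `num_steps=8` matches the 8-step budget quoted above.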
- TRACT: Denoising diffusion models with transitive closure time-distillation. CoRR, abs/2303.04248, 2023.
- Classifier-free diffusion guidance. CoRR, abs/2207.12598, 2022.
- Denoising diffusion probabilistic models. In Advances in Neural Information Processing Systems 33, NeurIPS, 2020.
- Gotta go fast when generating data with score-based models. CoRR, abs/2105.14080, 2021.
- Elucidating the design space of diffusion-based generative models. In Advances in Neural Information Processing Systems 35, NeurIPS, 2022.
- Understanding the diffusion objective as a weighted integral of ELBOs. CoRR, abs/2303.00848, 2023.
- Variational diffusion models. CoRR, abs/2107.00630, 2021.
- DiffWave: A versatile diffusion model for audio synthesis. In 9th International Conference on Learning Representations, ICLR, 2021.
- Flow matching for generative modeling. In The Eleventh International Conference on Learning Representations, ICLR, 2023.
- Flow straight and fast: Learning to generate and transfer data with rectified flow. In The Eleventh International Conference on Learning Representations, ICLR, 2023.
- Power hungry processing: Watts driving the cost of AI deployment? CoRR, abs/2311.16863, 2023.
- Diff-instruct: A universal approach for transferring knowledge from pre-trained diffusion models. CoRR, abs/2305.18455, 2023.
- On distillation of guided diffusion models. CoRR, abs/2210.03142, 2022.
- Photorealistic text-to-image diffusion models with deep language understanding. CoRR, abs/2205.11487, 2022.
- Progressive distillation for fast sampling of diffusion models. In The Tenth International Conference on Learning Representations, ICLR, 2022.
- Deep unsupervised learning using nonequilibrium thermodynamics. In Proceedings of the 32nd International Conference on Machine Learning, ICML, 2015.
- Denoising diffusion implicit models. In 9th International Conference on Learning Representations, ICLR, 2021a.
- Improved techniques for training consistency models. CoRR, abs/2310.14189, 2023.
- Score-based generative modeling through stochastic differential equations. In 9th International Conference on Learning Representations, ICLR, 2021b.
- Consistency models. In International Conference on Machine Learning, ICML, 2023.
- Fast sampling of diffusion models via operator learning. In International Conference on Machine Learning, ICML, 2023.
Authors: Jonathan Heek, Emiel Hoogeboom, Tim Salimans