BayesDiff: Estimating Pixel-wise Uncertainty in Diffusion via Bayesian Inference (2310.11142v2)
Abstract: Diffusion models have impressive image generation capability, but low-quality generations still exist, and their identification remains challenging due to the lack of a proper sample-wise metric. To address this, we propose BayesDiff, a pixel-wise uncertainty estimator for generations from diffusion models based on Bayesian inference. In particular, we derive a novel uncertainty iteration principle to characterize the uncertainty dynamics in diffusion, and leverage the last-layer Laplace approximation for efficient Bayesian inference. The estimated pixel-wise uncertainty can not only be aggregated into a sample-wise metric to filter out low-fidelity images but also aids in augmenting successful generations and rectifying artifacts in failed generations in text-to-image tasks. Extensive experiments demonstrate the efficacy of BayesDiff and its promise for practical applications.
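The abstract names two key ingredients: a last-layer Laplace approximation over the denoiser's output head, and an aggregation of the resulting pixel-wise uncertainty into a sample-wise metric used to filter generations. The sketch below is a minimal illustration of that pipeline, not the paper's implementation: it Monte-Carlo samples last-layer weights from an assumed diagonal Gaussian (Laplace) posterior, computes a pixel-wise variance map of the predicted noise, and collapses it into a per-image score. All names and design choices here (`mc_pixel_variance`, `aggregate_uncertainty`, a 1x1-conv head, a plain spatial mean, the reading that higher aggregated variance flags lower fidelity) are illustrative assumptions rather than details taken from the paper.

```python
# Minimal sketch (assumptions labeled below), not the BayesDiff implementation:
# estimate pixel-wise predictive variance from a last-layer Laplace posterior,
# then aggregate it into one scalar score per image.
import torch

def mc_pixel_variance(features, w_mean, w_var, n_samples=16):
    """features: (C, H, W) output of the frozen denoiser trunk (assumed setup).
    w_mean, w_var: mean and diagonal variance of the last 1x1-conv weights
    under a Laplace approximation, each of shape (C_out, C).
    Returns the per-pixel variance of the predicted noise, shape (C_out, H, W)."""
    preds = []
    for _ in range(n_samples):
        # Draw one weight sample from the diagonal Gaussian posterior.
        w = w_mean + w_var.sqrt() * torch.randn_like(w_mean)
        preds.append(torch.einsum("oc,chw->ohw", w, features))
    preds = torch.stack(preds)          # (n_samples, C_out, H, W)
    return preds.var(dim=0)             # pixel-wise predictive variance

def aggregate_uncertainty(pixel_var):
    """Collapse a pixel-wise variance map into a single sample-wise score.
    A plain spatial mean is used here; the paper's aggregation may differ."""
    return pixel_var.mean().item()

# Toy usage: score and rank a small batch of (stand-in) generations.
torch.manual_seed(0)
C, H, W, C_out = 8, 16, 16, 3
w_mean = torch.randn(C_out, C)
w_var = torch.rand(C_out, C) * 0.1
scores = []
for i in range(4):
    feats = torch.randn(C, H, W)        # stand-in for real trunk features
    scores.append((aggregate_uncertainty(mc_pixel_variance(feats, w_mean, w_var)), i))
# Assumption for illustration: higher aggregated variance marks lower-fidelity samples.
print(sorted(scores, reverse=True))
```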