BayesDiff: Estimating Pixel-wise Uncertainty in Diffusion via Bayesian Inference (2310.11142v2)

Published 17 Oct 2023 in cs.CV and cs.LG

Abstract: Diffusion models have impressive image generation capabilities, but low-quality generations still occur, and identifying them remains challenging because a proper sample-wise metric is lacking. To address this, we propose BayesDiff, a pixel-wise uncertainty estimator for generations from diffusion models based on Bayesian inference. In particular, we derive a novel uncertainty iteration principle to characterize the uncertainty dynamics in diffusion, and leverage the last-layer Laplace approximation for efficient Bayesian inference. The estimated pixel-wise uncertainty can not only be aggregated into a sample-wise metric to filter out low-fidelity images, but also aid in augmenting successful generations and rectifying artifacts in failed generations in text-to-image tasks. Extensive experiments demonstrate the efficacy of BayesDiff and its promise for practical applications.
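The abstract names two mechanisms worth making concrete: an uncertainty recursion that carries pixel-wise variance through the sampling chain, and an aggregation of that variance into a sample-wise filtering score. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the `last_layer_variance` stub, the toy schedule coefficients `(a_t, b_t)`, and the sum aggregation in `sample_score` are hypothetical stand-ins. BayesDiff obtains the noise-prediction variance from a last-layer Laplace approximation over the denoiser, and its derived recursion also tracks covariance terms that this independence-assuming version omits.

```python
# Minimal sketch (assumptions noted above; not the BayesDiff code).
import torch

def last_layer_variance(x_t, t):
    """Stub for Var[eps_theta(x_t, t)], the epistemic variance of the noise
    prediction. BayesDiff estimates this with a last-layer Laplace
    approximation over the denoiser; here we return a constant placeholder."""
    return 0.01 * torch.ones_like(x_t)

def ddim_step_with_uncertainty(x_mean, x_var, eps_mean, eps_var, a_t, b_t):
    """One deterministic DDIM-style update x_{t-1} = a_t * x_t + b_t * eps,
    with first-order variance propagation. Assuming x_t and eps independent:
        Var[x_{t-1}] ~= a_t^2 * Var[x_t] + b_t^2 * Var[eps].
    The paper's uncertainty recursion additionally handles their covariance."""
    new_mean = a_t * x_mean + b_t * eps_mean
    new_var = (a_t ** 2) * x_var + (b_t ** 2) * eps_var
    return new_mean, new_var

def sample_score(x_var):
    """Collapse pixel-wise variance into one scalar per image (higher = less
    trustworthy). Summing over all pixels is one simple aggregation choice."""
    return x_var.flatten(start_dim=1).sum(dim=1)

# Toy usage: two denoising steps on a random batch, then per-image scores.
x_mean = torch.randn(4, 3, 32, 32)          # x_T for a batch of 4 images
x_var = torch.zeros_like(x_mean)            # x_T is observed, so variance 0
for step, (a_t, b_t) in enumerate([(0.99, 0.05), (0.98, 0.06)]):
    eps_mean = torch.zeros_like(x_mean)     # stand-in for eps_theta(x_t, t)
    eps_var = last_layer_variance(x_mean, step)
    x_mean, x_var = ddim_step_with_uncertainty(
        x_mean, x_var, eps_mean, eps_var, a_t, b_t)
print(sample_score(x_var))                  # shape (4,): one score per image
```

In this toy setup, images whose score lands in the highest quantile of a batch are the ones that would be flagged as low-fidelity; per the abstract, the per-pixel map itself can additionally guide rectifying artifacts in failed text-to-image generations.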
