
A Variational Perspective on Solving Inverse Problems with Diffusion Models (2305.04391v2)

Published 7 May 2023 in cs.LG, cs.CV, cs.NA, math.NA, and stat.ML

Abstract: Diffusion models have emerged as a key pillar of foundation models in visual domains. One of their critical applications is to universally solve different downstream inverse tasks via a single diffusion prior without re-training for each task. Most inverse tasks can be formulated as inferring a posterior distribution over data (e.g., a full image) given a measurement (e.g., a masked image). This is however challenging in diffusion models since the nonlinear and iterative nature of the diffusion process renders the posterior intractable. To cope with this challenge, we propose a variational approach that by design seeks to approximate the true posterior distribution. We show that our approach naturally leads to regularization by denoising diffusion process (RED-Diff) where denoisers at different timesteps concurrently impose different structural constraints over the image. To gauge the contribution of denoisers from different timesteps, we propose a weighting mechanism based on signal-to-noise-ratio (SNR). Our approach provides a new variational perspective for solving inverse problems with diffusion models, allowing us to formulate sampling as stochastic optimization, where one can simply apply off-the-shelf solvers with lightweight iterates. Our experiments for image restoration tasks such as inpainting and superresolution demonstrate the strengths of our method compared with state-of-the-art sampling-based diffusion models.


Summary

  • The paper presents a novel variational inference approach that regularizes inverse problem solutions using diffusion models.
  • It demonstrates improvements in image fidelity, perceptual quality, and GPU efficiency over state-of-the-art sampling-based baselines.
  • A signal-to-noise ratio weighting mechanism underpins efficient stochastic optimization, adapting the method to diverse restoration tasks.

A Variational Approach to Solving Inverse Problems with Diffusion Models

Introduction

The emergence of diffusion models, such as Stable Diffusion, has marked a significant advancement in visual foundation models. Notably, these models serve as a robust prior for sampling in various downstream inverse problems, including image restoration and rendering. Their practical use, however, calls for samplers that are universal, i.e., applicable across tasks without re-training, while also remaining efficient and easy to tune. This paper introduces a variational approach to these challenges, leading to a method the authors call regularization by denoising diffusion process (RED-diff).

Background

Diffusion models have been increasingly applied to inverse problems across diverse domains. Prior approaches have attempted to build universal samplers but often struggled with the intractable, multimodal posterior distribution that arises from the nonlinear and recursive backward diffusion process. In response, this work uses variational inference to approximate the true posterior distribution. By adopting a principled variational perspective, RED-diff not only improves image fidelity and perceptual quality but also exhibits superior GPU efficiency.
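
Concretely, variational inference replaces the intractable posterior with a tractable candidate distribution and minimizes the divergence between the two. The following is the standard form of that objective in generic notation (not quoted from the paper), with measurement y, clean image x_0, and variational distribution q:

```latex
% Standard variational objective (generic notation, not verbatim from the paper):
% minimizing the KL divergence to the posterior is equivalent, up to the
% constant term log p(y), to maximizing an evidence lower bound.
\min_{q}\ \mathrm{KL}\!\left(q(x_0)\,\middle\|\,p(x_0 \mid y)\right)
  = \min_{q}\ \mathbb{E}_{q}\!\left[\log q(x_0) - \log p(y \mid x_0) - \log p(x_0)\right] + \log p(y)
```

In RED-diff, the prior term log p(x_0) is induced by the pretrained diffusion model, which is what turns its denoisers into regularizers.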

Methodology

The core of the proposed method is to approximate the posterior distribution of the data given the measurements through variational inference, using the denoising diffusion model as the data prior and the measurement model as the likelihood. The approach leads to regularization by the denoising diffusion process, where denoisers at different timesteps impose complementary structural constraints on the image. A key innovation is a weighting mechanism based on signal-to-noise ratio (SNR) that gauges the contribution of denoisers at different timesteps. Sampling is thereby formulated as stochastic optimization, so off-the-shelf solvers with lightweight iterates can be applied directly.
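
The resulting optimization loop can be sketched in a few lines of PyTorch. This is a minimal illustration under stated assumptions, not the authors' reference implementation: `denoiser` is assumed to be a pretrained epsilon-prediction network, `forward_op` the known measurement operator, and `alphas`/`sigmas` the diffusion noise schedule; the function name, hyperparameters, and the exact SNR-based weight are illustrative.

```python
import torch

def red_diff_restore(y, x_init, forward_op, denoiser, alphas, sigmas,
                     num_steps=1000, lr=0.1, lam=0.25):
    """Sketch of a RED-diff-style loop: optimize an image estimate mu so that
    forward_op(mu) matches the measurement y, regularized by a pretrained
    diffusion denoiser (all names here are assumptions, not the paper's API)."""
    mu = x_init.detach().clone().requires_grad_(True)  # variational mean (image estimate)
    opt = torch.optim.Adam([mu], lr=lr)
    T = len(alphas)
    for _ in range(num_steps):
        t = torch.randint(0, T, (1,)).item()       # sample a random diffusion timestep
        eps = torch.randn_like(mu)
        x_t = alphas[t] * mu + sigmas[t] * eps     # diffuse the current estimate
        with torch.no_grad():
            eps_pred = denoiser(x_t, t)            # frozen diffusion prior as denoiser
        w_t = sigmas[t] / alphas[t]                # SNR-based weight (illustrative choice)
        # Detaching the denoiser residual makes the regularizer's gradient
        # w.r.t. mu equal to w_t * (eps_pred - eps): regularization by denoising.
        loss_fid = (forward_op(mu) - y).pow(2).sum()
        loss_reg = (w_t * (eps_pred - eps).detach() * mu).sum()
        loss = loss_fid + lam * loss_reg
        opt.zero_grad()
        loss.backward()
        opt.step()
    return mu.detach()
```

Each iterate costs a single denoiser evaluation plus one application of the measurement operator, which is the sense in which the iterates are lightweight.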

Experiments and Results

The authors conducted extensive experiments on a range of linear and nonlinear image restoration tasks. The variational approach delivered superior image fidelity and perceptual quality compared with state-of-the-art samplers, and its lightweight, GPU-friendly iterates underscored its efficiency. Ablation studies further showed that optimizer parameters, such as the learning rate and the number of steps, provide an effective handle on the trade-off between fidelity and perceptual quality.
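
As a hypothetical usage of the sketch above, that trade-off could be explored by varying the optimizer settings; the values and the direction of the effect here are illustrative assumptions, not numbers from the paper:

```python
# Hypothetical settings for probing the fidelity/perception trade-off with
# the red_diff_restore sketch; values are illustrative only.
x_slow = red_diff_restore(y, x_init, forward_op, denoiser, alphas, sigmas,
                          num_steps=1000, lr=0.05)  # more steps, smaller lr
x_fast = red_diff_restore(y, x_init, forward_op, denoiser, alphas, sigmas,
                          num_steps=200, lr=0.5)    # fewer steps, larger lr
```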

Implications and Future Directions

This paper's variational perspective on solving inverse problems with diffusion models opens new avenues for research and application in AI. RED-diff provides a theoretically grounded and computationally efficient way to leverage diffusion models for a wide range of inverse problems. The SNR-based weighting mechanism is a promising direction for future work, with potential gains in adaptability and performance across tasks. Further investigation into the choice of variational distribution, and into methods that encourage solution diversity, may yield even more versatile and effective solutions.

Conclusion

The introduction of a variational approach to leverage diffusion models for solving inverse problems represents a significant advancement in the field. By enabling regularization through the denoising diffusion process and formulating sampling as stochastic optimization, the proposed method offers both theoretical insights and practical benefits. The demonstrated superiority in image fidelity and computational efficiency highlights the potential of this approach in advancing the capabilities of AI models in visual domains and beyond.