Solving Linear Inverse Problems Provably via Posterior Sampling with Latent Diffusion Models (2307.00619v1)

Published 2 Jul 2023 in cs.LG, cs.AI, and stat.ML

Abstract: We present the first framework to solve linear inverse problems leveraging pre-trained latent diffusion models. Previously proposed algorithms (such as DPS and DDRM) only apply to pixel-space diffusion models. We theoretically analyze our algorithm showing provable sample recovery in a linear model setting. The algorithmic insight obtained from our analysis extends to more general settings often considered in practice. Experimentally, we outperform previously proposed posterior sampling algorithms in a wide variety of problems including random inpainting, block inpainting, denoising, deblurring, destriping, and super-resolution.


Summary

  • The paper introduces PSLD, a novel algorithm using latent diffusion models to achieve provable sample recovery in linear inverse problems.
  • It leverages foundation models like Stable Diffusion, replacing computationally expensive pixel-space methods with an efficient latent-space approach.
  • Extensive experiments on FFHQ, ImageNet, and web images demonstrate superior performance in tasks such as inpainting, denoising, and super-resolution.

Solving Linear Inverse Problems with Latent Diffusion Models: A New Approach

Introduction

This work presents the first framework for solving linear inverse problems with latent diffusion models (LDMs). The approach leverages the rich image priors encapsulated in pre-trained LDMs such as Stable Diffusion for tasks including inpainting, denoising, and super-resolution. Previously proposed methods (such as DPS and DDRM) apply only to pixel-space diffusion models, which limits their scope. Through theoretical analysis and experimental validation, the paper both demonstrates the effectiveness of LDM priors and sets new benchmarks across a variety of problem settings.
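
To make the setting concrete: a linear inverse problem asks for an image x given measurements y = A(x) + noise, where the linear operator A is known. The sketch below is a minimal illustration of how the tasks named above map onto choices of A; it is not code from the paper, and the mask density, blur kernel, and downsampling factor are hypothetical values chosen for the example.

```python
import torch
import torch.nn.functional as F

# Each task corresponds to a different known linear operator A in y = A(x) + noise.
def inpaint(x, mask):
    """Random or block inpainting: A zeroes out the missing pixels."""
    return x * mask

def downsample(x, factor=4):
    """Super-resolution: A is an average-pooling downsampler."""
    return F.avg_pool2d(x, kernel_size=factor)

def blur(x, kernel):
    """Deblurring: A convolves each channel with a known kernel."""
    c = x.shape[1]
    weight = kernel.expand(c, 1, -1, -1)            # one kernel copy per channel
    return F.conv2d(x, weight, padding=kernel.shape[-1] // 2, groups=c)

x = torch.rand(1, 3, 256, 256)                      # stand-in ground-truth image
mask = (torch.rand(1, 1, 256, 256) > 0.5).float()   # ~50% random inpainting mask
kernel = torch.ones(1, 1, 9, 9) / 81.0              # uniform 9x9 blur kernel
y = inpaint(x, mask) + 0.05 * torch.randn_like(x)   # noisy measurements for one task
```

Denoising corresponds to A being the identity, and destriping to a mask that removes entire rows or columns.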

Theoretical Analysis of the Framework

The paper analyzes the algorithm in a linear model setting and proves sample recovery there; the insight gained from this analysis extends to the more general settings considered in practice. The algorithm, Posterior Sampling with Latent Diffusion (PSLD), exploits the data and compute already invested in foundation models such as Stable Diffusion, avoiding task-specific fine-tuning; the setting it targets is stated symbolically below.
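
To pin down what "provable sample recovery" targets, the following block states the linear model setting and the posterior-sampling goal in symbols; the notation is ours, not necessarily the paper's.

```latex
% Linear model setting: the measurements are a known linear map of the
% unknown image plus noise.
\[
  y \;=\; \mathcal{A}\,x^{\ast} + \eta,
  \qquad \mathcal{A}\colon \mathbb{R}^{n} \to \mathbb{R}^{m},\; m \le n .
\]
% With a latent diffusion prior, images are decoder outputs x = D(z), and a
% posterior sampler draws reconstructions from the induced conditional law:
\[
  \hat{x} \;\sim\; p(x \mid y) \;\propto\; p(y \mid x)\, p(x),
  \qquad x = \mathcal{D}(z),\; z \sim p(z).
\]
```

Roughly, provable recovery means that, under suitable conditions on the operator and the generative model, samples from this posterior concentrate on the true image.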

Algorithm and Implementation Details

Central to PSLD is running the diffusion process in latent space. This sidesteps the high dimensionality of pixel-space diffusion while harnessing the priors of pre-trained foundation models. The key methodological change is a modified guidance objective incorporating 'goodness' and 'gluing' adjustments, which steer each reverse-diffusion step toward latents whose decoded images are consistent with the measurements. This departs from previous posterior sampling strategies and yields superior performance in extensive experiments; a schematic sketch of one guided step follows.
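
The sketch below shows the shape of one guided reverse-diffusion step in this style. It is our schematic reading of the summary, not code from the paper: the denoiser, encoder E, decoder D, operator A with adjoint A_T, and the step sizes eta and gamma are all stand-ins, and the exact form of the 'goodness' and 'gluing' objectives may differ from the authors' definitions.

```python
import torch

def psld_step(z_t, t, y, A, A_T, E, D, denoise, sigma, eta=1.0, gamma=0.1):
    """One schematic guided reverse-diffusion step in latent space."""
    z_t = z_t.detach().requires_grad_(True)

    # Estimate the clean latent from the noisy one (e.g., via Tweedie's formula),
    # then decode it so the measurement operator can act in pixel space.
    z0_hat = denoise(z_t, t)
    x0_hat = D(z0_hat)

    # Measurement-consistency ('goodness'-style) term: the decoded estimate
    # should reproduce the observed measurements.
    data_loss = torch.linalg.vector_norm(y - A(x0_hat))

    # 'Gluing'-style term: overwrite the measured part of the decoded image
    # with the actual measurements, re-encode, and ask the latent to agree.
    x_glued = A_T(y) + (x0_hat - A_T(A(x0_hat)))
    glue_loss = torch.linalg.vector_norm(z0_hat - E(x_glued))

    # Guidance gradient with respect to the current noisy latent.
    grad = torch.autograd.grad(eta * data_loss + gamma * glue_loss, z_t)[0]

    # Placeholder for the unconditional sampler update (a DDPM/DDIM step in a
    # real implementation), corrected by the guidance gradient.
    z_prev = z0_hat + sigma * torch.randn_like(z_t)
    return (z_prev - grad).detach()

# Toy wiring with identity stand-ins, just to show the call signature.
D = E = lambda v: v
denoise = lambda v, t: v
mask = (torch.rand(1, 4, 8, 8) > 0.5).float()
A = A_T = lambda v: v * mask          # a binary mask operator is self-adjoint
y = A(torch.rand(1, 4, 8, 8))
z = psld_step(torch.randn(1, 4, 8, 8), t=500, y=y,
              A=A, A_T=A_T, E=E, D=D, denoise=denoise, sigma=0.1)
```

In an actual system, E and D would presumably be the pre-trained autoencoder of the latent diffusion model (e.g., the Stable Diffusion VAE), and the placeholder sampler line would be the model's standard reverse-diffusion update.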

Experimental Results and Benchmarking

Experiments cover both in-distribution data (the FFHQ dataset) and out-of-distribution samples (ImageNet and web images), demonstrating the algorithm's robustness and scalability. PSLD consistently outperforms existing posterior sampling algorithms across a range of inverse problems, including several inpainting variants, super-resolution, and denoising. Notably, using Stable Diffusion as the foundation generative model was central to the state-of-the-art results, underlining the value of large-scale pre-trained priors for solving inverse problems.

Practical Implications and Future Perspectives

PSLD broadens the applicability of latent diffusion models to a wide spectrum of inverse problems. It extends LDMs beyond conventional generative tasks while eliminating the computational overhead of fine-tuning a model for each task. Because PSLD treats the foundation model as a plug-in prior, it stands to improve as stronger foundation models become available. The current framework targets linear inverse problems; extending it to non-linear ones is an intriguing direction for future research.

Conclusion

This work marks a notable step in the use of latent diffusion models for solving linear inverse problems. Through rigorous theoretical analysis and strong experimental results, it extends the reach of generative modeling and opens pathways for further advances. PSLD demonstrates that latent diffusion priors can overcome the challenges of linear inverse problems, a promising direction for research and development in generative AI.
