Regularization by Texts for Latent Diffusion Inverse Solvers (2311.15658v3)

Published 27 Nov 2023 in cs.CV, cs.AI, and cs.LG

Abstract: The recent development of diffusion models has led to significant progress in solving inverse problems by leveraging these models as powerful generative priors. However, challenges persist due to the ill-posed nature of such problems, often arising from ambiguities in measurements or intrinsic system symmetries. To address this, here we introduce a novel latent diffusion inverse solver, regularization by text (TReg), inspired by the human ability to resolve visual ambiguities through perceptual biases. TReg integrates textual descriptions of preconceptions about the solution during reverse diffusion sampling, dynamically reinforcing these descriptions through null-text optimization, which we refer to as adaptive negation. Our comprehensive experimental results demonstrate that TReg effectively mitigates ambiguity in inverse problems, improving both accuracy and efficiency.
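
The abstract describes the mechanism only at a high level: text-conditioned reverse diffusion sampling in latent space, a data-consistency step against the measurement, and per-step optimization of the null-text embedding (the "adaptive negation"). Below is a minimal, hedged sketch of such a loop, assuming a classifier-free-guidance (CFG) sampler with a DDIM-style update. The function names (treg_like_sampler, operator_A, DummyUNet), the specific losses, learning rates, and schedules are illustrative stand-ins, not the authors' implementation.

```python
# Hypothetical sketch of a text-regularized latent-diffusion inverse solver:
# CFG sampling + measurement consistency + per-step optimization of the
# null-text embedding ("adaptive negation" in the abstract's terms).
# All names, losses, and hyperparameters are illustrative, not the paper's code.
import torch
import torch.nn.functional as F


def treg_like_sampler(unet, decoder, operator_A, y, text_emb, null_emb,
                      alphas_cumprod, timesteps,
                      guidance=7.5, dc_lr=1.0, neg_lr=1e-3, neg_steps=1):
    z = torch.randn(1, 4, 64, 64)                     # initial latent (assumed shape)
    null_emb = null_emb.clone().requires_grad_(True)  # trainable null-text embedding
    null_opt = torch.optim.Adam([null_emb], lr=neg_lr)

    for i, t in enumerate(timesteps):
        a_t = alphas_cumprod[t]
        a_prev = alphas_cumprod[timesteps[i + 1]] if i + 1 < len(timesteps) else torch.tensor(1.0)

        # Classifier-free guidance between the text prompt and the null text.
        with torch.no_grad():
            eps_c, eps_u = unet(z, t, text_emb), unet(z, t, null_emb)
        eps = eps_u + guidance * (eps_c - eps_u)

        # Tweedie estimate of the clean latent.
        z0 = (z - (1 - a_t).sqrt() * eps) / a_t.sqrt()

        # Data consistency: one gradient step toward the measurement y = A(x).
        z0 = z0.detach().requires_grad_(True)
        dc_loss = F.mse_loss(operator_A(decoder(z0)), y)
        z0 = (z0 - dc_lr * torch.autograd.grad(dc_loss, z0)[0]).detach()

        # "Adaptive negation" stand-in: update the null-text embedding so the
        # unconditional branch tracks the data-consistent estimate; the paper's
        # actual objective may differ from this surrogate.
        target_eps = (z - a_t.sqrt() * z0) / (1 - a_t).sqrt()
        for _ in range(neg_steps):
            null_opt.zero_grad()
            F.mse_loss(unet(z, t, null_emb), target_eps).backward()
            null_opt.step()

        # DDIM-style (eta = 0) transition to the next latent.
        z = a_prev.sqrt() * z0 + (1 - a_prev).sqrt() * eps

    return decoder(z).detach()


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs end to end; a real setup would plug in a
    # pretrained latent-diffusion UNet, its VAE decoder, a CLIP text encoder,
    # and the degradation operator of the inverse problem (blur, masking, ...).
    class DummyUNet(torch.nn.Module):
        def __init__(self):
            super().__init__()
            self.proj = torch.nn.Linear(8, 1)

        def forward(self, z, t, emb):
            # Output depends on emb so the null-text update has a gradient.
            return 0.1 * z + self.proj(emb).mean()

    decoder = lambda z: z                                # identity "decoder" for the demo
    operator_A = lambda x: x.mean(dim=1, keepdim=True)   # toy measurement operator

    alphas_cumprod = torch.linspace(0.999, 0.01, 1000)
    timesteps = torch.linspace(999, 0, 50).long().tolist()
    text_emb, null_emb = torch.randn(1, 4, 8), torch.zeros(1, 4, 8)
    y = torch.randn(1, 1, 64, 64)                        # stand-in measurement

    x_hat = treg_like_sampler(DummyUNet(), decoder, operator_A, y,
                              text_emb, null_emb, alphas_cumprod, timesteps)
    print(x_hat.shape)
```

The main departure from a standard CFG sampler in this sketch is that the null embedding is a trainable parameter updated at every step, so the "negative" direction adapts to the current estimate rather than staying fixed, which is how the abstract's adaptive negation is understood here.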

