
Self-Adaptive Reality-Guided Diffusion for Artifact-Free Super-Resolution (2403.16643v1)

Published 25 Mar 2024 in eess.IV and cs.CV

Abstract: Artifact-free super-resolution (SR) aims to translate low-resolution images into their high-resolution counterparts while strictly preserving the original content, eliminating any distortions or synthetic details. While traditional diffusion-based SR techniques have demonstrated remarkable abilities to enhance image detail, they are prone to introducing artifacts during their iterative procedures. Such artifacts, ranging from trivial noise to inauthentic textures, deviate from the true structure of the source image and thus undermine the integrity of the super-resolution process. In this work, we propose Self-Adaptive Reality-Guided Diffusion (SARGD), a training-free method that operates in the latent space to identify artifacts and mitigate their propagation. SARGD begins by using an artifact detector to identify implausible pixels, creating a binary mask that highlights artifacts. Following this, the Reality Guidance Refinement (RGR) process refines artifacts by integrating this mask with realistic latent representations, improving alignment with the original image. However, the initial realistic latent representations are derived from lower-quality images, which results in over-smoothing in the final output. To address this, we introduce a Self-Adaptive Guidance (SAG) mechanism, which dynamically computes a reality score to enhance the sharpness of the realistic latent. Applied alternately, these two mechanisms achieve artifact-free super-resolution. Extensive experiments demonstrate the superiority of our method, which delivers detailed, artifact-free high-resolution images while reducing the number of sampling steps by 2×. We release our code at https://github.com/ProAirVerse/Self-Adaptive-Guidance-Diffusion.git.
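The mask-then-refine step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the detector or the blending rule, so the code assumes a simple threshold test as a stand-in detector and a hard replacement of masked positions by the realistic latent. The names `detect_artifacts`, `reality_guided_refine`, and `threshold` are hypothetical, and the SAG reality score is left out because the abstract does not specify how it is computed.

```python
from typing import List

Latent = List[List[float]]
Mask = List[List[int]]


def detect_artifacts(latent: Latent, realistic: Latent, threshold: float = 0.5) -> Mask:
    """Hypothetical stand-in for the artifact detector.

    Flags a position (1) when the current latent deviates from the
    realistic latent by more than `threshold`; otherwise 0.
    The paper uses a dedicated detector; this rule is an assumption.
    """
    return [
        [1 if abs(x - r) > threshold else 0 for x, r in zip(row, ref_row)]
        for row, ref_row in zip(latent, realistic)
    ]


def reality_guided_refine(latent: Latent, realistic: Latent, mask: Mask) -> Latent:
    """Assumed blending rule: where the mask is 1, take the realistic
    latent; elsewhere, keep the current latent unchanged."""
    return [
        [r if m else x for x, r, m in zip(row, ref_row, mask_row)]
        for row, ref_row, mask_row in zip(latent, realistic, mask)
    ]


# Toy 2x2 "latents" for illustration only.
latent = [[0.0, 2.0], [1.0, 1.0]]
realistic = [[0.1, 1.0], [1.0, 0.2]]

mask = detect_artifacts(latent, realistic)
refined = reality_guided_refine(latent, realistic, mask)
```

In a diffusion sampler, a step of this kind would presumably run inside the denoising loop, with the realistic latent itself updated over time by the SAG mechanism; that part is not modeled here.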

