
PromptIR: Prompting for All-in-One Blind Image Restoration (2306.13090v1)

Published 22 Jun 2023 in cs.CV

Abstract: Image restoration involves recovering a high-quality clean image from its degraded version. Deep learning-based methods have significantly improved image restoration performance, however, they have limited generalization ability to different degradation types and levels. This restricts their real-world application since it requires training individual models for each specific degradation and knowing the input degradation type to apply the relevant model. We present a prompt-based learning approach, PromptIR, for All-In-One image restoration that can effectively restore images from various types and levels of degradation. In particular, our method uses prompts to encode degradation-specific information, which is then used to dynamically guide the restoration network. This allows our method to generalize to different degradation types and levels, while still achieving state-of-the-art results on image denoising, deraining, and dehazing. Overall, PromptIR offers a generic and efficient plugin module with few lightweight prompts that can be used to restore images of various types and levels of degradation with no prior information on the corruptions present in the image. Our code and pretrained models are available here: https://github.com/va1shn9v/PromptIR

PromptIR: Prompting for All-in-One Blind Image Restoration

The paper "PromptIR: Prompting for All-in-One Blind Image Restoration" addresses a central challenge in image restoration: developing a unified model capable of handling diverse degradation types without prior knowledge of the specific degradation present. Unlike existing deep learning approaches, which require a dedicated network for each degradation type, PromptIR introduces a prompt-based learning method that achieves robust restoration across multiple degradation scenarios.

Key Contributions

PromptIR stands out by integrating a prompting mechanism that encodes degradation-specific information, which dynamically guides the restoration process. The central component of this approach is the prompt block, designed as a plug-and-play module within a restoration network. The paper details how these prompt blocks dynamically adjust feature representations, enhancing the model's generalization capabilities.

  1. Prompt-Based Framework: The authors propose a framework where degradation-conditioned prompts interact with feature representations to dynamically adapt the restoration process. This eliminates the need for pre-identifying degradation types, overcoming a significant limitation of prior models.
  2. Comprehensive Evaluation: The framework's efficacy is validated across several challenging image restoration tasks, including dehazing, deraining, and denoising. The results show that PromptIR outperforms existing methods, achieving state-of-the-art performance in both single-task and all-in-one settings.
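The deployment difference described above can be sketched as two interfaces: the prior setting keeps one specialist network per degradation and must be told which one to apply, while the all-in-one setting exposes a single blind entry point. The restorer functions below are placeholder stand-ins, not real networks:

```python
# Hypothetical sketch of the two deployment settings; the restorers here are
# identity placeholders, not actual restoration models.

def denoise(img):  return {"output": img, "model": "denoiser"}
def derain(img):   return {"output": img, "model": "derainer"}
def dehaze(img):   return {"output": img, "model": "dehazer"}

SPECIALISTS = {"noise": denoise, "rain": derain, "haze": dehaze}

def restore_with_specialists(img, degradation_type):
    """Prior setting: the caller must identify the corruption in advance."""
    return SPECIALISTS[degradation_type](img)

def restore_all_in_one(img):
    """PromptIR-style setting: one model, no degradation label required;
    the degradation context is inferred internally (via prompts in the paper)."""
    return {"output": img, "model": "all-in-one"}
```

The blind entry point is what removes the need for a separate degradation classifier or user-supplied label at inference time.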

Numerical Results

The paper presents strong numerical results demonstrating the capability of PromptIR:

  • On image dehazing tasks, PromptIR achieved a notable boost of 8.13 dB PSNR over competing all-in-one methods.
  • In the deraining task, the proposed method demonstrated a 2.13 dB improvement in PSNR.
  • For image denoising, particularly at challenging noise levels such as σ = 50, PromptIR outperformed existing techniques with a gain of 0.51 dB on specific datasets.

These results underscore PromptIR's ability to generalize across various degradation types, which is a significant departure from models that rely on pre-trained, condition-specific networks.
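The gains above are reported in PSNR (peak signal-to-noise ratio), the standard fidelity metric for restoration, where even fractions of a dB are meaningful. As a quick reference, PSNR for images in [0, 1] is 10·log10(1/MSE), which the following minimal numpy sketch computes:

```python
import numpy as np

def psnr(reference, restored, max_val=1.0):
    """Peak signal-to-noise ratio in dB for images in [0, max_val]."""
    mse = np.mean((reference.astype(np.float64) - restored.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10((max_val ** 2) / mse)

gt = np.zeros((8, 8))
pred = gt + 0.1              # uniform error of 0.1 -> MSE = 0.01
print(round(psnr(gt, pred), 2))  # 20.0 dB
```

Because the scale is logarithmic, an 8.13 dB improvement corresponds to roughly a 6.5x reduction in mean squared error.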

Methodology

The prompting mechanism is implemented through prompt blocks, each consisting of a Prompt Generation Module (PGM) and a Prompt Interaction Module (PIM). The PGM generates input-conditioned prompts that enrich the input features with degradation context, while the PIM integrates these prompts into the restoration network's feature stream. This architecture ensures adaptability to different degradation types without explicit degradation knowledge or additional retraining.
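A minimal numpy sketch of this two-stage flow, under illustrative assumptions (the shapes, the softmax weighting over K learned prompt components, and the simple additive fusion are stand-ins; the paper's actual PIM uses a transformer block):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: K learned prompt components of dimension C,
# and a feature map F of shape (C, H, W).
K, C, H, W = 5, 16, 8, 8
prompt_components = rng.normal(size=(K, C))   # learnable in the real model

def prompt_generation(features):
    """PGM sketch: pool features globally, predict softmax weights over the
    K components, and return their weighted sum as an input-conditioned prompt."""
    pooled = features.mean(axis=(1, 2))       # (C,) global degradation context
    logits = prompt_components @ pooled       # (K,) affinity with each component
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()                  # softmax over components
    return weights @ prompt_components        # (C,) generated prompt

def prompt_interaction(features, prompt):
    """PIM sketch: broadcast the prompt spatially and fuse it with the
    features (simple addition here, in place of the paper's transformer block)."""
    return features + prompt[:, None, None]   # (C, H, W)

F = rng.normal(size=(C, H, W))
restored_features = prompt_interaction(F, prompt_generation(F))
```

Because the prompt is recomputed from each input's pooled statistics, the same block adapts its guidance to whatever degradation the input exhibits.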

Implications and Future Directions

The findings from this paper have profound implications for the field of image restoration. The adoption of prompting techniques in low-level vision tasks signifies a shift towards models that can handle a wide spectrum of degradations without prior knowledge. This could significantly reduce the resource constraints associated with training and deploying multiple specialized models, particularly on mobile and edge devices.

In the future, extending PromptIR to handle even more complex degradations and testing its utility in real-world applications would be beneficial. Additionally, exploring the integration of such prompt-based techniques in other computer vision tasks could open new avenues for research in adaptive and efficient model designs.

In summary, the paper presents a detailed exploration of the PromptIR framework, offering a promising approach to overcoming existing limitations in image restoration. The proposed unified model, enhanced by prompt blocks, achieves notable performance improvements across tasks, setting a new benchmark in the domain of image restoration.

Authors (4)
  1. Vaishnav Potlapalli
  2. Syed Waqas Zamir
  3. Salman Khan
  4. Fahad Shahbaz Khan