
ProRes: Exploring Degradation-aware Visual Prompt for Universal Image Restoration (2306.13653v1)

Published 23 Jun 2023 in cs.CV

Abstract: Image restoration aims to reconstruct degraded images, e.g., denoising or deblurring. Existing works focus on designing task-specific methods and there are inadequate attempts at universal methods. However, simply unifying multiple tasks into one universal architecture suffers from uncontrollable and undesired predictions. To address those issues, we explore prompt learning in universal architectures for image restoration tasks. In this paper, we present Degradation-aware Visual Prompts, which encode various types of image degradation, e.g., noise and blur, into unified visual prompts. These degradation-aware prompts provide control over image processing and allow weighted combinations for customized image restoration. We then leverage degradation-aware visual prompts to establish a controllable and universal model for image restoration, called ProRes, which is applicable to an extensive range of image restoration tasks. ProRes leverages the vanilla Vision Transformer (ViT) without any task-specific designs. Furthermore, the pre-trained ProRes can easily adapt to new tasks through efficient prompt tuning with only a few images. Without bells and whistles, ProRes achieves competitive performance compared to task-specific methods and experiments can demonstrate its ability for controllable restoration and adaptation for new tasks. The code and models will be released in \url{https://github.com/leonmakise/ProRes}.

Authors (6)
  1. Jiaqi Ma (83 papers)
  2. Tianheng Cheng (31 papers)
  3. Guoli Wang (40 papers)
  4. Qian Zhang (308 papers)
  5. Xinggang Wang (163 papers)
  6. Lefei Zhang (64 papers)
Citations (33)

Summary

ProRes: Exploring Degradation-aware Visual Prompt for Universal Image Restoration

The paper presents a novel approach to universal image restoration called ProRes, which leverages degradation-aware visual prompts. Image restoration is a fundamental computer-vision task because it recovers images degraded by factors such as low light, noise, blur, and rain. Traditionally, restoration methods have relied on task-specific architectures that, although effective, do not generalize well across different types of degradation. ProRes addresses this limitation with a versatile framework that handles a wide range of restoration tasks through a single unified model, built around degradation-aware visual prompts that make the restoration process controllable.

Context and Motivation

Traditional image restoration methods typically employ deep learning architectures tailored to a particular degradation, such as denoising, deraining, or deblurring. These specialized solutions do not transfer across restoration problems, so each task requires its own model design. ProRes aims to overcome this inefficiency with a universal model that manages multiple image degradation types without task-specific modifications.

Degradation-aware Visual Prompts

The core innovation of ProRes is its degradation-aware visual prompts, which act as parametric identifiers for distinct types of image degradation. By encoding each degradation type in its own learnable prompt, ProRes provides precise control over the restoration process. The model accepts either a single task prompt or a weighted combination of several prompts, which lets it handle images affected by multiple forms of degradation at once, as sketched below.
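To make the weighted combination concrete, here is a minimal sketch (not the authors' released code; the names, tensor shapes, and the simple additive blending are assumptions) of how degradation-aware prompts could be mixed and added to a degraded input before it is passed to the restoration backbone:

```python
# Minimal sketch (assumed names, not the authors' code): blending
# degradation-aware visual prompts by weighted sum before restoration.
import torch

def combine_prompts(prompts, weights):
    """Weighted sum of per-degradation prompts, e.g. {'noise': 0.7, 'blur': 0.3}."""
    total = sum(weights.values())
    combined = torch.zeros_like(next(iter(prompts.values())))
    for task, w in weights.items():
        combined = combined + (w / total) * prompts[task]
    return combined

# Learnable prompts, one per degradation type, shaped like the input image.
H, W = 256, 256
prompts = {t: torch.nn.Parameter(torch.zeros(1, 3, H, W))
           for t in ["noise", "blur", "rain", "low_light"]}

degraded = torch.rand(1, 3, H, W)                       # degraded input image
prompt = combine_prompts(prompts, {"noise": 0.7, "blur": 0.3})
prompted_input = degraded + prompt                      # prompt is added to the image
# prompted_input would then be fed to the plain ViT restoration model.
```

Adjusting the weights is what gives the user control: emphasizing the "noise" prompt steers the model toward denoising, while a mixture targets compound degradations.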

Methodology

ProRes employs a vanilla Vision Transformer (ViT) architecture without bespoke designs for individual restoration tasks. It operates by adding task-specific visual prompts directly to the input images; these prompts help the ViT discern the type and degree of degradation and guide the restoration accordingly. The visual prompts are pre-trained with a lightweight model to provide an effective initialization before being fine-tuned together with the larger ViT. A key capability of ProRes is that it can adapt to new tasks or datasets through prompt tuning, which updates only the prompts rather than the entire model; a sketch of this setup follows.
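The prompt-tuning step can be illustrated with a short sketch, again using assumed names and hyperparameters rather than the released implementation: the pre-trained backbone is frozen and only a newly introduced prompt is optimized on the new task's data.

```python
# Minimal sketch (assumed names, not the released implementation): prompt tuning
# for a new task keeps the pre-trained ViT frozen and optimizes only a new prompt.
import torch

class ProResLike(torch.nn.Module):
    def __init__(self, backbone, image_size=(3, 256, 256)):
        super().__init__()
        self.backbone = backbone                                        # pre-trained plain ViT restorer
        self.prompt = torch.nn.Parameter(torch.zeros(1, *image_size))   # new task's learnable prompt

    def forward(self, degraded):
        return self.backbone(degraded + self.prompt)                    # prompt added to the input

def prompt_tune(model, loader, epochs=20, lr=1e-3):
    for p in model.backbone.parameters():                # freeze the backbone
        p.requires_grad_(False)
    optimizer = torch.optim.AdamW([model.prompt], lr=lr) # train only the prompt
    loss_fn = torch.nn.L1Loss()
    for _ in range(epochs):
        for degraded, clean in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(degraded), clean)
            loss.backward()
            optimizer.step()
```

Because only the prompt tensor receives gradients, adapting to a new degradation in this sketch means training roughly 200K parameters (a 3x256x256 prompt) instead of the full ViT, which is why a few images can suffice.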

Empirical Evaluation

The experiments show that ProRes performs competitively across a range of image restoration tasks, including denoising, deraining, low-light enhancement, and deblurring. The reported results put ProRes on par with established task-specific methods, supporting its use as a versatile, scalable solution for low-level vision. Visualizations further demonstrate the control ProRes offers over restoration outputs by adjusting the combination and weighting of prompts. Notably, ProRes adapts effectively to new datasets and tasks via prompt tuning, indicating strong transferability.

Implications and Future Work

ProRes is indicative of the growing trend towards universal models in computer vision that aim to diminish the limitations associated with task specificity. By introducing degradation-aware visual prompts, the researchers set a foundation for future exploration into more sophisticated universal frameworks that could handle a broader range of vision tasks more efficiently. Further investigation could seek to enhance the control and granularity with which restoration tasks are executed by developing more sophisticated prompts or incorporating additional model architectures.

The paper provides valuable insights into the potential amalgamation of prompt learning and universal modeling to achieve both breadth and specificity in image restoration. It stands as a potential benchmark for further research efforts aimed at enhancing the scalability and adaptability of image restoration models.
