Distilling Semantic Priors from SAM to Efficient Image Restoration Models (2403.16368v2)
Abstract: In image restoration (IR), leveraging semantic priors from segmentation models is a common way to improve performance. The recent Segment Anything Model (SAM) has emerged as a powerful tool for extracting advanced semantic priors to enhance IR tasks. However, SAM's computational cost is prohibitive for IR compared with existing, smaller IR models, and incorporating SAM to extract semantic priors considerably hampers inference efficiency. To address this issue, we propose a general framework that distills SAM's semantic knowledge to boost existing IR models without interfering with their inference process. Specifically, the framework consists of a semantic priors fusion (SPF) scheme and a semantic priors distillation (SPD) scheme. SPF fuses the restored image predicted by the original IR model with the semantic mask predicted by SAM to produce a refined restored image. SPD uses self-distillation to transfer the fused semantic priors to the original IR model and boost its performance. Additionally, we design a semantic-guided relation (SGR) module for SPD, which enforces consistency of the semantic feature representation space so that the priors are fully distilled. We demonstrate the effectiveness of our framework across multiple IR models and tasks, including deraining, deblurring, and denoising.
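The abstract does not specify how the SGR module measures consistency of the semantic feature space, so the following is only a rough sketch of one plausible reading, not the authors' formulation. It assumes that features are averaged inside each mask region, that relations between regions are cosine similarities, and that the student is penalized when its relation matrix drifts from the teacher's. All function names (`region_prototypes`, `relation_matrix`, `relation_distillation_loss`) and the toy data are hypothetical.

```python
import math

def region_prototypes(features, mask):
    """Average the feature vectors of all pixels that share a mask label.

    features: list of per-pixel feature vectors (flattened image).
    mask: list of integer region labels, same length as features.
    Returns one mean vector per label, in sorted label order.
    """
    sums, counts = {}, {}
    for vec, label in zip(features, mask):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return [[s / counts[k] for s in sums[k]] for k in sorted(sums)]

def cosine(a, b, eps=1e-8):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)

def relation_matrix(prototypes):
    """Pairwise cosine similarities between region prototypes."""
    return [[cosine(a, b) for b in prototypes] for a in prototypes]

def relation_distillation_loss(student_feat, teacher_feat, mask):
    """Mean squared difference between student and teacher relation matrices.

    This is an assumed loss for illustration; the paper may define it differently.
    """
    r_s = relation_matrix(region_prototypes(student_feat, mask))
    r_t = relation_matrix(region_prototypes(teacher_feat, mask))
    n = len(r_s)
    return sum((r_s[i][j] - r_t[i][j]) ** 2
               for i in range(n) for j in range(n)) / (n * n)

# Toy data: four pixels, two mask regions, two feature channels.
mask = [0, 0, 1, 1]
teacher = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
student = [[1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [1.0, 1.0]]

same_loss = relation_distillation_loss(teacher, teacher, mask)
diff_loss = relation_distillation_loss(student, teacher, mask)
print(same_loss, diff_loss)
```

In this toy case the teacher keeps the two regions orthogonal while the student lets them overlap, so the loss is positive for the student and zero when the two feature sets are identical. Because such a term would only be evaluated during training, it fits the abstract's claim that the original IR model's inference process is left unchanged.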