Multi-Scale Representation Learning for Image Restoration with State-Space Model (2408.10145v1)
Abstract: Image restoration aims to reconstruct a high-quality, detail-rich image from a degraded counterpart, a pivotal step in photography and many computer vision systems. In real-world scenarios, different types of degradation cause the loss of image details at various scales and reduce image contrast. Existing methods predominantly rely on CNNs and Transformers to capture multi-scale representations. However, these methods are limited by the high computational complexity of Transformers and the constrained receptive field of CNNs, which prevents them from achieving both strong performance and efficiency in image restoration. To address these challenges, we propose MS-Mamba, a novel multi-scale state-space-model-based network for efficient image restoration that enhances the capacity for multi-scale representation learning through our proposed global and regional SSM modules. Additionally, an Adaptive Gradient Block (AGB) and a Residual Fourier Block (RFB) are proposed to improve the network's detail extraction capabilities by capturing gradients in various directions and facilitating the learning of details in the frequency domain. Extensive experiments on nine public benchmarks across four classic image restoration tasks (image deraining, dehazing, denoising, and low-light enhancement) demonstrate that our proposed method achieves new state-of-the-art performance while maintaining low computational complexity. The source code will be publicly available.
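The abstract does not give the SSM formulation or the scan order used by the global and regional modules, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a scalar linear recurrence with fixed coefficients (Mamba itself uses input-dependent, selective parameters), a raster scan, and non-overlapping windows. The names `ssm_scan`, `global_scan`, and `regional_scan`, the constants `a`, `b`, `c`, and the window size are all hypothetical.

```python
from typing import List, Sequence

Image = List[List[float]]


def ssm_scan(x: Sequence[float], a: float = 0.9, b: float = 0.1, c: float = 1.0) -> List[float]:
    """Scalar linear state-space recurrence: h_t = a*h_{t-1} + b*x_t, y_t = c*h_t.

    Runs in time linear in len(x). The fixed a, b, c are illustrative assumptions.
    """
    h = 0.0
    out = []
    for value in x:
        h = a * h + b * value
        out.append(c * h)
    return out


def global_scan(image: Image) -> Image:
    """Assumed 'global' variant: one raster scan over the whole flattened image."""
    height, width = len(image), len(image[0])
    flat = [pixel for row in image for pixel in row]
    y = ssm_scan(flat)
    return [y[r * width:(r + 1) * width] for r in range(height)]


def regional_scan(image: Image, window: int = 2) -> Image:
    """Assumed 'regional' variant: an independent scan inside each window."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for top in range(0, height, window):
        for left in range(0, width, window):
            coords = [
                (row, col)
                for row in range(top, min(top + window, height))
                for col in range(left, min(left + window, width))
            ]
            y = ssm_scan([image[row][col] for row, col in coords])
            for (row, col), value in zip(coords, y):
                out[row][col] = value
    return out


if __name__ == "__main__":
    # A single bright pixel in the top-left corner of a 4x4 image.
    impulse = [[1.0, 0.0, 0.0, 0.0]] + [[0.0] * 4 for _ in range(3)]
    print(global_scan(impulse)[0])    # response spreads along the whole scan
    print(regional_scan(impulse)[0])  # response stays inside the first window
```

In this toy setting the global scan lets one pixel influence every later position in the sequence, while the regional scan confines that influence to its own window, which is one plausible way to read the "global and regional" split at different scales.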