IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model (2405.09873v1)
Abstract: Infrared (IR) image super-resolution faces challenges from homogeneous background pixel distributions and sparse target regions, requiring models that handle long-range dependencies effectively and capture detailed local-global information. Recent Mamba-based models, built on selective structured state space models, have shown significant potential in visual tasks, suggesting their applicability to IR enhancement. In this work, we introduce IRSRMamba, a novel Mamba-based model designed specifically for IR image super-resolution. The model improves the restoration of context-sparse target details through its long-range dependency modeling. In addition, a new wavelet transform feature modulation block improves multi-scale receptive field representation, capturing both global and local information efficiently. Comprehensive evaluations confirm that IRSRMamba outperforms existing models on multiple benchmarks. This research advances IR super-resolution and demonstrates the potential of Mamba-based models in IR image processing. Code is available at https://github.com/yongsongH/IRSRMamba.
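To make the multi-scale idea behind a wavelet-based block more concrete, the following is a minimal sketch of a single-level 2D Haar wavelet transform in plain Python. It is not the paper's wavelet transform feature modulation block, whose design the abstract does not specify; the function names `haar_dwt2` and `haar_idwt2` and the sub-band labels are assumptions chosen for illustration, and sub-band naming conventions vary between libraries. The sketch only shows how an image splits into a half-resolution approximation plus three detail sub-bands, and that the split is invertible.

```python
def haar_dwt2(img):
    """Single-level 2D Haar transform (illustrative sketch, not the paper's block).

    `img` is a list of rows with even height and width.
    Returns four half-resolution sub-bands: (ll, lh, hl, hh).
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    ll = [[0.0] * (w // 2) for _ in range(h // 2)]
    lh = [[0.0] * (w // 2) for _ in range(h // 2)]
    hl = [[0.0] * (w // 2) for _ in range(h // 2)]
    hh = [[0.0] * (w // 2) for _ in range(h // 2)]
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            # Each 2x2 block [[a, b], [c, d]] maps to one coefficient per sub-band.
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            ll[i // 2][j // 2] = (a + b + c + d) / 2  # local average (approximation)
            lh[i // 2][j // 2] = (a + b - c - d) / 2  # top row minus bottom row
            hl[i // 2][j // 2] = (a - b + c - d) / 2  # left column minus right column
            hh[i // 2][j // 2] = (a - b - c + d) / 2  # diagonal difference
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the full-resolution image."""
    h2, w2 = len(ll), len(ll[0])
    img = [[0.0] * (2 * w2) for _ in range(2 * h2)]
    for i in range(h2):
        for j in range(w2):
            s, v, u, x = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            img[2 * i][2 * j] = (s + v + u + x) / 2
            img[2 * i][2 * j + 1] = (s + v - u - x) / 2
            img[2 * i + 1][2 * j] = (s - v + u - x) / 2
            img[2 * i + 1][2 * j + 1] = (s - v - u + x) / 2
    return img
```

On a uniform background all three detail sub-bands are zero, while an isolated bright pixel produces non-zero detail coefficients only in its own 2x2 block. This mirrors the setting the abstract describes, where sparse targets sit on a homogeneous background.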