
IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model (2405.09873v1)

Published 16 May 2024 in cs.CV and eess.IV

Abstract: Infrared (IR) image super-resolution faces challenges from homogeneous background pixel distributions and sparse target regions, requiring models that effectively handle long-range dependencies and capture detailed local-global information. Recent advances in Mamba (selective structured state space model) architectures have shown significant potential in visual tasks, suggesting their applicability to IR enhancement. In this work, we introduce IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model, a novel Mamba-based model designed specifically for IR image super-resolution. This model enhances the restoration of context-sparse target details through its advanced dependency modeling capabilities. Additionally, a new wavelet transform feature modulation block improves multi-scale receptive field representation, capturing both global and local information efficiently. Comprehensive evaluations confirm that IRSRMamba outperforms existing models on multiple benchmarks. This research advances IR super-resolution and demonstrates the potential of Mamba-based models in IR image processing. Code is available at https://github.com/yongsongH/IRSRMamba.


Summary

  • The paper introduces IRSRMamba, merging Mamba-based state-space modeling with wavelet feature modulation to address long-range dependencies and enhance sparse detail restoration in infrared images.
  • The model reaches a PSNR of 39.33 dB at ×2 scale on the result-A dataset and outperforms both traditional and recent state-of-the-art methods.
  • The model’s success paves the way for advanced state-space techniques in infrared imaging, with promising applications in security and planetary exploration.

IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model

The paper "IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model" by Yongsong Huang et al. examines the challenges of infrared (IR) image super-resolution and proposes an approach that combines a Mamba-based state-space model (SSM) with wavelet transform techniques. The authors report that IRSRMamba outperforms existing methods, which they attribute to better handling of long-range dependencies and better restoration of the sparse details typical of IR imagery.

Methodological Innovations

The proposed IRSRMamba model builds on a Mamba-based backbone network. Mamba derives from structured state-space models, which are rooted in continuous linear time-invariant systems, and extends them with input-dependent (selective) parameters. The backbone is intended to capture long-range dependencies in spatial data efficiently, which targets the two difficulties the abstract names for IR images: homogeneous backgrounds and sparse target regions. The authors present the application of a Mamba-based approach to IR image super-resolution as a key contribution.
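
To make the state-space idea concrete, the following is a minimal sketch of a discretized linear state-space recurrence, not the authors' implementation. It assumes a diagonal state matrix, fixed parameters, and zero-order-hold discretization; the function name ssm_scan and all parameter names are hypothetical. Mamba additionally makes the step size and the input and output projections depend on the input, which this sketch omits.

```python
import math

def ssm_scan(xs, a, b, c, delta):
    """Run a diagonal linear state-space model over a 1-D sequence.

    Continuous form (per state dimension i): h_i'(t) = a[i] * h_i(t) + b[i] * x(t),
    with output y(t) = sum_i c[i] * h_i(t). Each a[i] should be negative so the
    state decays instead of growing. All names here are illustrative.
    """
    # Zero-order-hold discretization for a diagonal state matrix.
    a_bar = [math.exp(delta * ai) for ai in a]
    b_bar = [(abi - 1.0) / ai * bi for abi, ai, bi in zip(a_bar, a, b)]

    h = [0.0] * len(a)  # hidden state starts at zero
    ys = []
    for x in xs:
        # The state carries a decayed memory of every earlier input.
        h = [abi * hi + bbi * x for abi, hi, bbi in zip(a_bar, h, b_bar)]
        ys.append(sum(ci * hi for ci, hi in zip(c, h)))
    return ys

if __name__ == "__main__":
    # An impulse at step 0 still influences the output several steps later.
    print(ssm_scan([1.0, 0.0, 0.0, 0.0], a=[-1.0], b=[1.0], c=[1.0], delta=0.1))
```

The cost of the scan is linear in the sequence length, which is the property that makes this family of models attractive for long flattened image sequences.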

Additionally, the authors incorporate a wavelet transform feature modulation block to support multi-scale feature representation. Transforming features into the wavelet domain separates coarse structure from fine detail, which helps the model capture both global and local information. According to the summary, modulating the feature maps with the wavelet transform, combined with different convolution operations, improves scale-specific detail capture and enables finer restoration of sparse patterns and context in IR images.
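
As a rough illustration of what wavelet-domain modulation can look like, the sketch below applies a one-level 2-D Haar transform to a single feature map, scales each subband by a gain, and inverts the transform. This is a generic example, not the block described in the paper: the choice of the Haar wavelet, the scalar per-subband gains, and the function names haar_dwt2, haar_idwt2, and modulate are all assumptions made for illustration.

```python
def haar_dwt2(img):
    """One-level 2-D Haar transform of a list-of-lists with even height and width.

    Returns four half-resolution subbands: the low-pass band and the detail
    bands along the width, along the height, and along the diagonal.
    """
    ll, d_w, d_h, d_diag = [], [], [], []
    for i in range(0, len(img), 2):
        r_ll, r_w, r_h, r_d = [], [], [], []
        for j in range(0, len(img[0]), 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            r_ll.append((a + b + c + d) / 2.0)
            r_w.append((a - b + c - d) / 2.0)  # change along the width
            r_h.append((a + b - c - d) / 2.0)  # change along the height
            r_d.append((a - b - c + d) / 2.0)  # diagonal change
        ll.append(r_ll); d_w.append(r_w); d_h.append(r_h); d_diag.append(r_d)
    return ll, d_w, d_h, d_diag

def haar_idwt2(ll, d_w, d_h, d_diag):
    """Invert haar_dwt2, rebuilding the full-resolution map."""
    out = []
    for r in range(len(ll)):
        top, bottom = [], []
        for k in range(len(ll[0])):
            s, x, y, z = ll[r][k], d_w[r][k], d_h[r][k], d_diag[r][k]
            top += [(s + x + y + z) / 2.0, (s - x + y - z) / 2.0]
            bottom += [(s + x - y - z) / 2.0, (s - x - y + z) / 2.0]
        out += [top, bottom]
    return out

def modulate(img, gains=(1.0, 1.0, 1.0, 1.0)):
    """Scale each subband by its gain, then transform back (hypothetical helper)."""
    bands = haar_dwt2(img)
    scaled = [[[g * v for v in row] for row in band] for g, band in zip(gains, bands)]
    return haar_idwt2(*scaled)

if __name__ == "__main__":
    feature = [[1.0, 2.0], [3.0, 4.0]]
    print(modulate(feature, gains=(1.0, 2.0, 2.0, 2.0)))  # boost detail bands
```

With unit gains the round trip reproduces the input, and setting the three detail gains to zero leaves only the 2×2 block averages. A learned block would typically replace the scalar gains with convolutions or attention over the subbands.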

Strong Numerical Results and Evaluation

The paper evaluates IRSRMamba on multiple benchmarks and reports that it outperforms both traditional and recent state-of-the-art methods, such as EDSR, ESRGAN, and SwinIR, on Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM). The ablation studies also report PSNR gains. In particular, IRSRMamba achieves a PSNR of 39.33 dB at a scale factor of ×2 on the result-A dataset, ahead of the compared models, with improved restoration of the fine features common in IR imagery.
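
For readers unfamiliar with the metric, PSNR is defined as 10 · log10(MAX² / MSE), where MSE is the mean squared error between the reference and restored images and MAX is the largest possible pixel value. The sketch below computes it for single-channel images stored as lists of lists. It assumes 8-bit pixel values by default and does not reproduce the paper's evaluation protocol (for example, any border cropping or channel conversion), which this summary does not specify.

```python
import math

def psnr(reference, restored, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equally sized images."""
    ref = [v for row in reference for v in row]
    res = [v for row in restored for v in row]
    if len(ref) != len(res):
        raise ValueError("images must contain the same number of pixels")

    # Mean squared error over all pixels.
    mse = sum((r - s) ** 2 for r, s in zip(ref, res)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

if __name__ == "__main__":
    clean = [[10.0, 20.0], [30.0, 40.0]]
    noisy = [[11.0, 19.0], [31.0, 39.0]]  # every pixel off by one
    print(round(psnr(clean, noisy), 2))
```

Because the scale is logarithmic, each halving of the mean squared error raises PSNR by about 3 dB, so differences of a few tenths of a dB between methods correspond to modest changes in average error.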

Implications and Future Directions

The research presented in this paper offers several theoretical and practical implications. The introduction of IRSRMamba demonstrates that Mamba-based models can be successfully applied beyond their conventional domains, providing a robust framework for resolving long-range dependencies in complex image datasets like those seen in infrared imaging. The wavelet transform feature modulation block creates new avenues for feature extraction and enhancement, potentially inspiring further research in other image processing applications.

As future work, further study of Mamba models for IR image enhancement could extend these results. Such advances could matter in fields such as security and planetary exploration, where infrared imaging is pivotal. The reported results across several benchmarks also suggest the model may apply to other sensor data domains that need higher resolution and better detail restoration.

In conclusion, this paper contributes an IR super-resolution method that combines a Mamba-based backbone with wavelet transform feature modulation to address the domain-specific challenges of IR image enhancement. The reported benchmark results position IRSRMamba as a strong baseline for future work on IR image processing.
