CD-CTFM: A Lightweight CNN-Transformer Network for Remote Sensing Cloud Detection Fusing Multiscale Features (2306.07186v1)
Abstract: Clouds in remote sensing images inevitably obscure information, hindering subsequent analysis of satellite imagery. Hence, cloud detection is a necessary preprocessing step. However, existing methods require large numbers of computations and parameters. In this letter, a lightweight CNN-Transformer network, CD-CTFM, is proposed to solve this problem. CD-CTFM is based on an encoder-decoder architecture and incorporates an attention mechanism. In the encoder part, we utilize a lightweight network combining CNN and Transformer as the backbone, which is conducive to extracting local and global features simultaneously. Moreover, a lightweight feature pyramid module is designed to fuse multiscale features with contextual information. In the decoder part, we integrate a lightweight channel-spatial attention module into each skip connection between the encoder and decoder, extracting low-level features while suppressing irrelevant information without introducing many parameters. Finally, the proposed model is evaluated on two cloud datasets, 38-Cloud and MODIS. The results demonstrate that CD-CTFM achieves accuracy comparable to state-of-the-art methods while outperforming them in efficiency.
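The overall layout described in the abstract — an encoder-decoder whose skip connections each pass through a lightweight channel-spatial attention module before fusion — can be sketched in PyTorch. This is a minimal illustration, not the paper's implementation: the layer widths, the plain strided-convolution encoder (standing in for the CNN-Transformer backbone), and the CBAM-style attention block are all assumptions for demonstration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelSpatialAttention(nn.Module):
    """CBAM-style channel + spatial attention (illustrative assumption)."""

    def __init__(self, channels, reduction=4):
        super().__init__()
        # Channel attention: squeeze spatially, excite per channel.
        self.channel_mlp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Spatial attention: pool over channels, produce a 1-channel mask.
        self.spatial = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.channel_mlp(x)
        avg = x.mean(dim=1, keepdim=True)
        mx, _ = x.max(dim=1, keepdim=True)
        return x * self.spatial(torch.cat([avg, mx], dim=1))


class CDCTFMSketch(nn.Module):
    """Encoder-decoder with attention-gated skips; widths are illustrative."""

    def __init__(self, in_ch=4, num_classes=1, widths=(16, 32, 64)):
        super().__init__()
        self.encoders = nn.ModuleList()
        prev = in_ch
        for w in widths:  # each stage halves the resolution
            self.encoders.append(nn.Sequential(
                nn.Conv2d(prev, w, 3, stride=2, padding=1),
                nn.BatchNorm2d(w), nn.ReLU(inplace=True)))
            prev = w
        # One channel-spatial attention module per skip connection.
        self.attns = nn.ModuleList(
            ChannelSpatialAttention(w) for w in widths[:-1])
        self.decoders = nn.ModuleList()
        rev = list(widths)[::-1]
        for hi, lo in zip(rev[:-1], rev[1:]):
            self.decoders.append(nn.Sequential(
                nn.Conv2d(hi + lo, lo, 3, padding=1),
                nn.BatchNorm2d(lo), nn.ReLU(inplace=True)))
        self.head = nn.Conv2d(widths[0], num_classes, 1)

    def forward(self, x):
        skips = []
        for enc in self.encoders:
            x = enc(x)
            skips.append(x)
        x = skips.pop()  # deepest features
        for dec, attn in zip(self.decoders, reversed(self.attns)):
            skip = attn(skips.pop())  # gate the skip before fusing it
            x = F.interpolate(x, size=skip.shape[-2:],
                              mode="bilinear", align_corners=False)
            x = dec(torch.cat([x, skip], dim=1))
        x = F.interpolate(self.head(x), scale_factor=2,
                          mode="bilinear", align_corners=False)
        return torch.sigmoid(x)  # per-pixel cloud probability
```

For a 4-band 64x64 patch, `CDCTFMSketch()(torch.randn(1, 4, 64, 64))` returns a `(1, 1, 64, 64)` cloud-probability map. The efficiency claims in the abstract would come from replacing these plain convolutions with the paper's lightweight CNN-Transformer backbone and feature pyramid module.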