SFFNet: A Wavelet-Based Spatial and Frequency Domain Fusion Network for Remote Sensing Segmentation (2405.01992v1)
Abstract: To fully exploit spatial information and to handle regions with large grayscale variation in remote sensing segmentation, we propose SFFNet (Spatial and Frequency Domain Fusion Network). The framework uses a two-stage design: the first stage extracts features with spatial methods to obtain sufficient spatial detail and semantic information; the second stage maps these features in both the spatial and frequency domains. For the frequency-domain mapping, we introduce the Wavelet Transform Feature Decomposer (WTFD), which decomposes features into low-frequency and high-frequency components using the Haar wavelet transform and integrates them with spatial features. To bridge the semantic gap between frequency and spatial features, and to select significant features so that features from the two representation domains combine well, we design the Multiscale Dual-Representation Alignment Filter (MDAF), which uses multiscale convolutions and dual cross-attention. Comprehensive experiments show that SFFNet outperforms existing methods in mIoU, reaching 84.80% and 87.73% on the two evaluated datasets, respectively. The code is available at https://github.com/yysdck/SFFNet.
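The low/high-frequency split that the abstract attributes to WTFD can be illustrated with a single-level 2D Haar transform. The following is a minimal sketch, not the authors' implementation: the function name `haar_dwt2`, the orthonormal scaling, the sub-band naming (conventions differ between libraries), and the random feature map are all assumptions made for illustration.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar transform over the last two axes of x.

    Assumes the last two dimensions (H, W) are even. Returns the
    low-frequency band and a tuple of three high-frequency bands,
    each with spatial size (H/2, W/2).
    """
    # The four pixels of every non-overlapping 2x2 block.
    a = x[..., 0::2, 0::2]  # top-left
    b = x[..., 0::2, 1::2]  # top-right
    c = x[..., 1::2, 0::2]  # bottom-left
    d = x[..., 1::2, 1::2]  # bottom-right

    # Orthonormal Haar combinations (scaling 1/2 preserves energy).
    ll = (a + b + c + d) / 2.0  # low frequency: local average
    lh = (a - b + c - d) / 2.0  # differences across columns
    hl = (a + b - c - d) / 2.0  # differences across rows
    hh = (a - b - c + d) / 2.0  # diagonal differences
    return ll, (lh, hl, hh)

# Hypothetical feature map with 8 channels and 16x16 spatial size.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 16))
low, high = haar_dwt2(feat)
print(low.shape, [h.shape for h in high])
```

Each sub-band has half the spatial resolution of the input, so a fusion module would have to account for that resolution change when combining the bands with spatial features; how SFFNet does this is not stated in the abstract.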