
RS-Mamba for Large Remote Sensing Image Dense Prediction (2404.02668v2)

Published 3 Apr 2024 in cs.CV

Abstract: Context modeling is critical for remote sensing image dense prediction tasks. Nowadays, the growing size of very-high-resolution (VHR) remote sensing images poses challenges in effectively modeling context. While transformer-based models possess global modeling capabilities, they encounter computational challenges when applied to large VHR images due to their quadratic complexity. The conventional practice of cropping large images into smaller patches results in a notable loss of contextual information. To address these issues, we propose the Remote Sensing Mamba (RSM) for dense prediction tasks in large VHR remote sensing images. RSM is specifically designed to capture the global context of remote sensing images with linear complexity, facilitating the effective processing of large VHR images. Considering that the land covers in remote sensing images are distributed in arbitrary spatial directions due to characteristics of remote sensing over-head imaging, the RSM incorporates an omnidirectional selective scan module to globally model the context of images in multiple directions, capturing large spatial features from various directions. Extensive experiments on semantic segmentation and change detection tasks across various land covers demonstrate the effectiveness of the proposed RSM. We designed simple yet effective models based on RSM, achieving state-of-the-art performance on dense prediction tasks in VHR remote sensing images without fancy training strategies. Leveraging the linear complexity and global modeling capabilities, RSM achieves better efficiency and accuracy than transformer-based models on large remote sensing images. Interestingly, we also demonstrated that our model generally performs better with a larger image size on dense prediction tasks. Our code is available at https://github.com/walking-shadow/Official_Remote_Sensing_Mamba.
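To make the abstract's core idea concrete, the sketch below reduces the omnidirectional selective scan to its simplest form: a 2D feature map is flattened along several scan directions (row-major, reversed, column-major, diagonal), each sequence is processed by a linear-time 1D operation, and the results are scattered back and merged. This is a hypothetical illustration, not the paper's implementation: the actual RSM uses more directions and a learned selective state-space recurrence, for which a running mean stands in here purely as a linear-complexity placeholder.

```python
import numpy as np

def directional_orders(h, w):
    """Index orderings for four scan directions over an h x w grid.

    A simplified stand-in for the paper's omnidirectional selective
    scan module, which scans in more directions than shown here.
    """
    idx = np.arange(h * w).reshape(h, w)
    return [
        idx.reshape(-1),                  # left-to-right, top-to-bottom
        idx.reshape(-1)[::-1],            # reversed raster order
        idx.T.reshape(-1),                # top-to-bottom, left-to-right
        np.concatenate([idx.diagonal(k) for k in range(-h + 1, w)]),  # diagonals
    ]

def omni_scan(x):
    """Merge linear-time 1D scans taken along several directions.

    x: (h, w, c) feature map. The 1D op is a running mean, which is
    linear in sequence length like a state-space scan, but carries no
    learned parameters.
    """
    h, w, c = x.shape
    flat = x.reshape(h * w, c)
    merged = np.zeros_like(flat)
    orders = directional_orders(h, w)
    for order in orders:
        seq = flat[order]
        # running mean as a linear-complexity placeholder for the
        # selective state-space recurrence
        scanned = np.cumsum(seq, axis=0) / np.arange(1, h * w + 1)[:, None]
        out = np.empty_like(scanned)
        out[order] = scanned              # scatter back to spatial positions
        merged += out
    return (merged / len(orders)).reshape(h, w, c)

x = np.random.rand(4, 5, 3)
y = omni_scan(x)
print(y.shape)  # (4, 5, 3)
```

Because each directional pass is a single linear-time sweep over the flattened sequence, the total cost grows linearly with the number of pixels, which is the property that lets RSM process large VHR images whole rather than in cropped patches.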

Overview of an Incomplete Paper Submission

The provided content is an incomplete document: it contains only the reference and formatting specifications typical of an academic research article. The LaTeX source suggests a scholarly document intended for compilation, most likely within computer science or a closely related field.

Document Structure and Intent

The document uses the basic LaTeX class article, which is commonly employed for academic manuscripts such as research papers, review articles, and technical notes. The reference section is invoked with \nocite{*}, indicating that every entry in the accompanying bibliography, paper.bib, was to be included in the reference list regardless of whether it was cited in the text. The specified bibliographic style, IEEEtran, is widely used in engineering and computer science to format citations and references in accordance with the standards of the Institute of Electrical and Electronics Engineers (IEEE).
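Based on the commands named above, the submitted source would reduce to roughly the following skeleton (the bibliography file name paper.bib comes from the text; the body content is, by all indications, absent):

```latex
\documentclass{article}

\begin{document}

% Body of the paper (absent in the provided submission)

\nocite{*}                    % include every bibliography entry, cited or not
\bibliographystyle{IEEEtran}  % IEEE citation and reference formatting
\bibliography{paper}          % entries drawn from paper.bib

\end{document}
```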

Implications and Directions for Future Research

While the body of the paper is not provided, its formatting and structural decisions support some reasonable inferences. The use of LaTeX and the IEEE citation format points toward a rigorous presentation suited to high-quality research outlets. Assuming the paper aims to contribute to a technical field, its likely contributions would involve advances in methodology, novel algorithms, or insights into state-of-the-art technologies that enrich the theoretical or practical landscape of the addressed topic.

Future developments stemming from such a research effort might include:

  • Further refinement or expansion of proposed methodologies.
  • Empirical studies that validate initial theoretical claims or hypotheses.
  • Cross-disciplinary applications of the concepts discussed, which might enrich fields that benefit from robust computational approaches.

Conclusion

The fragmentary nature of the content makes it impossible to identify the specific themes or numerical results the original paper intended to present. It does, however, establish a foundational structure from which a complete academic document could be developed: scholars and practitioners in technical disciplines routinely use such structured documents to communicate complex ideas efficiently and precisely. The speculative directions suggested here serve as a generic outline for research efforts typically accompanied by an organized LaTeX paper formatted in IEEE style.

Authors (6)
  1. Sijie Zhao
  2. Hao Chen
  3. Xueliang Zhang
  4. Pengfeng Xiao
  5. Lei Bai
  6. Wanli Ouyang
Citations (41)