MSFMamba: Multi-Scale Feature Fusion State Space Model for Multi-Source Remote Sensing Image Classification (2408.14255v2)
Abstract: In the field of multi-source remote sensing image classification, remarkable progress has been made using Convolutional Neural Networks (CNNs) and Transformers. Recently, Mamba-based methods built upon the State Space Model (SSM) have shown great potential for long-range dependency modeling with linear complexity, but they have rarely been explored for multi-source remote sensing image classification tasks. To address this gap, we propose the Multi-Scale Feature Fusion Mamba (MSFMamba) network, a novel framework designed for the joint classification of hyperspectral image (HSI) and Light Detection and Ranging (LiDAR)/Synthetic Aperture Radar (SAR) data. The MSFMamba network is composed of three key components: the Multi-Scale Spatial Mamba (MSpa-Mamba) block, the Spectral Mamba (Spe-Mamba) block, and the Fusion Mamba (Fus-Mamba) block. The MSpa-Mamba block employs a multi-scale strategy to reduce computational cost and alleviate feature redundancy across multiple scanning routes, ensuring efficient spatial feature modeling. The Spe-Mamba block focuses on spectral feature extraction, addressing the unique challenges of HSI data representation. Finally, the Fus-Mamba block bridges the heterogeneous gap between HSI and LiDAR/SAR data by extending the original Mamba architecture to accommodate dual inputs, enhancing cross-modal feature interactions and enabling seamless data fusion. Together, these components enable MSFMamba to effectively tackle the challenges of multi-source data classification, delivering improved performance with optimized computational efficiency. Comprehensive experiments on four real-world multi-source remote sensing datasets demonstrate that MSFMamba outperforms several state-of-the-art methods. The source code of MSFMamba is publicly available at https://github.com/oucailab/MSFMamba.
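All three blocks described above build on the state space recurrence that gives Mamba its linear complexity. The following minimal sketch illustrates that core recurrence, h_t = A·h_{t-1} + B·x_t, y_t = C·h_t, in plain Python; the function name and scalar parameters are illustrative assumptions, not the paper's implementation, which adds input-dependent selectivity, multi-scale spatial scanning, and a dual-input fusion variant.

```python
def ssm_scan(x, A=0.9, B=0.5, C=1.0):
    """Run a scalar linear SSM over sequence x in O(len(x)) time.

    Illustrative sketch only: a real Mamba block uses learned,
    input-dependent (selective) matrices and vector-valued states.
    """
    h = 0.0
    y = []
    for x_t in x:
        h = A * h + B * x_t   # state update: carries long-range context
        y.append(C * h)       # readout at each step
    return y
```

Because each step touches the state once, the scan costs O(L) for a length-L sequence, in contrast to the O(L^2) attention of a Transformer; this is the efficiency property the abstract refers to.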