S3Net: Innovating Stereo Matching and Semantic Segmentation with a Single-Branch Semantic Stereo Network in Satellite Epipolar Imagery (2401.01643v3)
Abstract: Stereo matching and semantic segmentation are significant tasks in binocular satellite 3D reconstruction. However, previous studies have primarily treated them as independent parallel tasks and lack an integrated multitask learning framework. This work introduces the Single-branch Semantic Stereo Network (S3Net), which combines semantic segmentation and stereo matching using Self-Fuse and Mutual-Fuse modules. Unlike preceding methods that use semantic or disparity information independently, our method identifies and leverages the intrinsic link between the two tasks, leading to more accurate semantic understanding and disparity estimation. Comparative experiments on the US3D dataset demonstrate the effectiveness of S3Net. Our model improves the mIoU in semantic segmentation from 61.38 to 67.39, and reduces the D1-Error and average endpoint error (EPE) in disparity estimation from 10.051 to 9.579 and from 1.439 to 1.403, respectively, surpassing existing competitive methods. Our code is available at: https://github.com/CVEO/S3Net.
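For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how mIoU, EPE, and D1-Error are commonly computed. It is not taken from the S3Net repository. The function names, the `valid` mask argument, and the D1 thresholds (an error above 3 pixels and above 5% of the ground-truth disparity, following the widely used KITTI convention) are assumptions made for illustration; the paper's exact evaluation protocol may differ.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes that appear in pred or target."""
    ious = []
    for c in range(num_classes):
        p, t = (pred == c), (target == c)
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both maps, so it is skipped
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))

def epe(pred_disp, gt_disp, valid=None):
    """Average endpoint error: mean absolute disparity error over valid pixels."""
    err = np.abs(pred_disp - gt_disp)
    if valid is None:
        valid = np.ones(err.shape, dtype=bool)
    return float(err[valid].mean())

def d1_error(pred_disp, gt_disp, valid=None, abs_thresh=3.0, rel_thresh=0.05):
    """Percentage of valid pixels whose error exceeds both thresholds.

    The thresholds are an assumption (KITTI-style), not confirmed by the paper.
    """
    err = np.abs(pred_disp - gt_disp)
    if valid is None:
        valid = np.ones(err.shape, dtype=bool)
    bad = (err > abs_thresh) & (err > rel_thresh * np.abs(gt_disp))
    return float(100.0 * bad[valid].mean())

# Toy usage on hand-made arrays (not real US3D data).
seg_pred = np.array([[0, 1], [1, 1]])
seg_gt = np.array([[0, 1], [0, 1]])
disp_gt = np.array([10.0, 20.0, -30.0, 40.0])
disp_pred = np.array([10.5, 24.0, -30.0, 40.0])

print(mean_iou(seg_pred, seg_gt, num_classes=2))
print(epe(disp_pred, disp_gt))
print(d1_error(disp_pred, disp_gt))
```

Higher is better for mIoU, while lower is better for both EPE (in pixels) and D1-Error (in percent), which matches the direction of the improvements reported in the abstract.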