Texture-Semantic Collaboration Network for ORSI Salient Object Detection (2312.03548v1)
Abstract: Salient object detection (SOD) in optical remote sensing images (ORSIs) has attracted growing attention in recent years. Owing to the characteristics of ORSIs, ORSI-SOD poses several challenges, such as multiple objects, small objects, low illumination, and irregular shapes. To address these challenges, we propose a concise yet effective Texture-Semantic Collaboration Network (TSCNet) that explores the collaboration of texture cues and semantic cues for ORSI-SOD. Specifically, TSCNet is built on the generic encoder-decoder structure. In addition to the encoder and decoder, TSCNet includes a vital Texture-Semantic Collaboration Module (TSCM), which performs valuable feature modulation and interaction on the basic features extracted by the encoder. The main idea of the TSCM is to make full use of the texture features at the lowest level and the semantic features at the highest level to enhance the representation of salient regions in the features. In the TSCM, we first enhance the positions of potential salient regions using the semantic features, and then render and restore object details using the texture features. Meanwhile, we also perceive regions at various scales and construct interactions between different regions. Thanks to the combination of the TSCM and the generic structure, our TSCNet attends to both the position and the details of salient objects, effectively handling various scenes. Extensive experiments on three datasets demonstrate that our TSCNet achieves competitive performance compared with 14 state-of-the-art methods. The code and results of our method are available at https://github.com/MathLee/TSCNet.
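The TSCM described above lends itself to a compact feature-modulation layer. Below is a minimal, hypothetical PyTorch sketch, not the authors' released implementation: the module name TextureSemanticCollab, the channel sizes, and the specific gating and dilation choices are assumptions made only to illustrate the three steps the abstract names, namely semantic-guided position enhancement, texture-guided detail restoration, and multi-scale perception with cross-region fusion.

```python
# Hypothetical sketch of the texture-semantic collaboration idea (not the
# authors' code): the highest-level (semantic) feature gates the positions of
# salient regions, the lowest-level (texture) feature re-injects fine details,
# and dilated convolutions perceive regions at multiple scales before fusion.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TextureSemanticCollab(nn.Module):
    """Toy module: modulate a mid-level feature with semantic and texture cues."""

    def __init__(self, mid_ch: int, sem_ch: int, tex_ch: int):
        super().__init__()
        # Project the semantic feature to a single-channel position attention map.
        self.sem_gate = nn.Sequential(nn.Conv2d(sem_ch, 1, kernel_size=1), nn.Sigmoid())
        # Project the texture feature to the mid-level channel dimension.
        self.tex_proj = nn.Conv2d(tex_ch, mid_ch, kernel_size=3, padding=1)
        # Multi-scale perception via dilated convolutions with growing receptive fields.
        self.multi_scale = nn.ModuleList(
            [nn.Conv2d(mid_ch, mid_ch, 3, padding=d, dilation=d) for d in (1, 2, 4)]
        )
        self.fuse = nn.Conv2d(3 * mid_ch, mid_ch, kernel_size=1)

    def forward(self, mid, sem, tex):
        # 1) Semantic cue: highlight the positions of potential salient regions.
        pos = F.interpolate(self.sem_gate(sem), size=mid.shape[2:],
                            mode="bilinear", align_corners=False)
        feat = mid * pos + mid
        # 2) Texture cue: render/restore object details from the lowest-level feature.
        tex_up = F.interpolate(self.tex_proj(tex), size=mid.shape[2:],
                               mode="bilinear", align_corners=False)
        feat = feat + tex_up
        # 3) Perceive regions at multiple scales and let them interact via fusion.
        feat = self.fuse(torch.cat([conv(feat) for conv in self.multi_scale], dim=1))
        return feat


if __name__ == "__main__":
    m = TextureSemanticCollab(mid_ch=64, sem_ch=512, tex_ch=64)
    mid = torch.randn(1, 64, 32, 32)    # mid-level encoder feature
    sem = torch.randn(1, 512, 8, 8)     # highest-level (semantic) feature
    tex = torch.randn(1, 64, 128, 128)  # lowest-level (texture) feature
    print(m(mid, sem, tex).shape)       # torch.Size([1, 64, 32, 32])
```

In this sketch, the residual form `mid * pos + mid` keeps the original response while amplifying semantically salient positions; the paper's actual modulation, interaction, and fusion operators may differ.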