
CLiSA: A Hierarchical Hybrid Transformer Model using Orthogonal Cross Attention for Satellite Image Cloud Segmentation (2311.17475v2)

Published 29 Nov 2023 in cs.CV and eess.IV

Abstract: Clouds in optical satellite images are a major concern because their presence hinders accurate analysis and processing. Clouds also affect the image tasking schedule and waste valuable storage space on both ground and space-based systems. For these reasons, deriving accurate cloud masks from optical remote-sensing images is an important task. Traditional cloud detection methods for satellite images, such as threshold-based and spatial filtering approaches, suffer from a lack of accuracy. In recent years, deep learning algorithms have emerged as a promising approach to image segmentation problems, as they allow pixel-level classification and semantic-level segmentation. In this paper, we introduce a deep-learning model based on a hybrid transformer architecture for effective cloud mask generation, named CLiSA (Cloud segmentation via Lipschitz Stable Attention network). In this context, we propose a concept of orthogonal self-attention combined with a hierarchical cross-attention model, and we validate its Lipschitz stability theoretically and empirically. We design the whole setup in an adversarial setting in the presence of the Lovász-Softmax loss. We demonstrate both qualitative and quantitative outcomes on multiple satellite image datasets, including Landsat-8, Sentinel-2, and Cartosat-2s. Through a comparative study, we show that our model performs favorably against other state-of-the-art methods and also generalizes better in precise cloud extraction from satellite multi-spectral (MX) images. We also present several ablation studies to support our choices of architectural elements and objective functions.
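The abstract names the Lovász-Softmax loss without spelling it out. As an illustration only, the following is a minimal pure-Python sketch of the single-class (cloud vs. background) form of that loss as it is generally described: per-pixel errors are sorted in decreasing order and weighted by the gradient of the Lovász extension of the Jaccard loss. The function names `lovasz_grad` and `lovasz_softmax_binary` and the flat-list input format are assumptions made for this example; this is not the authors' implementation, and how the loss is combined with the adversarial objective is not stated here.

```python
def lovasz_grad(gt_sorted):
    """Gradient of the Lovasz extension of the Jaccard loss.

    gt_sorted: ground-truth labels (0 or 1), ordered by decreasing error.
    """
    total_fg = sum(gt_sorted)
    grad = []
    cum_fg = 0      # foreground pixels seen so far
    cum_bg = 0      # background pixels seen so far
    prev_jaccard = 0.0
    for g in gt_sorted:
        cum_fg += g
        cum_bg += 1 - g
        intersection = total_fg - cum_fg
        union = total_fg + cum_bg          # always >= 1 inside the loop
        jaccard = 1.0 - intersection / union
        grad.append(jaccard - prev_jaccard)
        prev_jaccard = jaccard
    return grad


def lovasz_softmax_binary(fg_probs, labels):
    """Single-class Lovasz-Softmax loss over a flattened image.

    fg_probs: predicted probability of the foreground (cloud) class per pixel.
    labels:   ground-truth labels per pixel, 1 for cloud and 0 otherwise.
    """
    errors = [abs(y - p) for p, y in zip(fg_probs, labels)]
    # Sort pixels by decreasing error; Python's sort is stable for ties.
    order = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    errors_sorted = [errors[i] for i in order]
    gt_sorted = [labels[i] for i in order]
    grad = lovasz_grad(gt_sorted)
    return sum(e * g for e, g in zip(errors_sorted, grad))


if __name__ == "__main__":
    labels = [1, 0]
    print(lovasz_softmax_binary([1.0, 0.0], labels))  # perfect prediction
    print(lovasz_softmax_binary([0.0, 1.0], labels))  # fully wrong prediction
```

In a multi-class setting the per-class losses would typically be averaged over the classes; the sketch keeps a single class to stay short.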

Authors (2)
  1. Subhajit Paul (22 papers)
  2. Ashutosh Gupta (27 papers)