Learnable Prompt for Few-Shot Semantic Segmentation in Remote Sensing Domain (2404.10307v1)

Published 16 Apr 2024 in cs.CV and cs.AI

Abstract: Few-shot segmentation is the task of segmenting objects or regions of novel classes within an image given only a few annotated examples. In the generalized setting, the task extends to segmenting both the base and the novel classes. The main challenge is to train the model so that adding novel classes does not degrade performance on the base classes, a problem known as catastrophic forgetting. To mitigate this issue, we use SegGPT as our base model and train it on the base classes. Then, we use separate learnable prompts to handle predictions for each novel class. To handle the wide range of object sizes typically present in the remote sensing domain, we perform patch-based prediction. To address the discontinuities along patch boundaries, we propose a patch-and-stitch technique that reframes the problem as an image inpainting task. During inference, we also use image similarity search over image embeddings for prompt selection and novel class filtering to reduce false positive predictions. In our experiments, the proposed method boosts the weighted mIoU of a simple fine-tuned SegGPT from 15.96 to 35.08 on the validation set of the few-shot OpenEarthMap dataset provided in the challenge.
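
The similarity-search step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes image embeddings are already available as plain float vectors, uses cosine similarity as the similarity measure, and treats novel class filtering as a simple threshold on the best match. The function names `select_prompts` and `keep_novel_class` and the threshold value are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_prompts(
    query: Sequence[float],
    prompt_embeddings: Sequence[Sequence[float]],
    k: int = 1,
) -> List[int]:
    """Return indices of the k prompt images most similar to the query image."""
    scores = [cosine_similarity(query, p) for p in prompt_embeddings]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def keep_novel_class(
    query: Sequence[float],
    class_embeddings: Sequence[Sequence[float]],
    threshold: float,
) -> bool:
    """Assumed filtering rule: predict a novel class only if the query image
    is similar enough to at least one support image of that class."""
    best = max(
        (cosine_similarity(query, e) for e in class_embeddings),
        default=0.0,
    )
    return best >= threshold


# Toy 2-D embeddings, chosen by hand for illustration only.
query = [1.0, 0.0]
prompts = [[0.0, 1.0], [1.0, 0.1], [0.5, 0.5]]
top2 = select_prompts(query, prompts, k=2)
```

In this toy example the query is closest to the second and third prompt embeddings, so those two would be used as prompts, and a class whose support images all fall below the threshold would be skipped to avoid false positives.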
