
Adapting the Segment Anything Model During Usage in Novel Situations

Published 12 Apr 2024 in cs.CV | (2404.08421v1)

Abstract: The interactive segmentation task consists of creating object segmentation masks based on user interactions. The most common way to guide a model towards a correct segmentation is through clicks on the object and the background. The recently published Segment Anything Model (SAM) supports a generalized version of the interactive segmentation problem and has been trained on an object segmentation dataset containing 1.1B masks. Although SAM was trained extensively and with the explicit purpose of serving as a foundation model, we show that it has significant limitations when applied to interactive segmentation on novel domains or object types. On the datasets used, SAM displays a failure rate $\text{FR}_{30}@90$ of up to $72.6\%$. Since we still want such foundation models to be immediately applicable, we present a framework that can adapt SAM during immediate usage. For this, we leverage the user interactions and masks that are constructed during the interactive segmentation process. We use this information to generate pseudo-labels, which we use to compute a loss function and optimize a part of the SAM model. The presented method causes a relative reduction of up to $48.1\%$ in the $\text{FR}_{20}@85$ and $46.6\%$ in the $\text{FR}_{30}@90$ metrics.
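
The abstract does not define the failure rate metric. The sketch below assumes the definition commonly used in the interactive segmentation literature: $\text{FR}_{N}@q$ is the percentage of instances for which an IoU of $q\,\%$ is not reached within $N$ clicks. The function names, the input layout (one list of per-click IoU values per instance), and the toy numbers are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def failure_rate(
    ious_per_instance: Sequence[Sequence[float]],
    max_clicks: int,
    iou_threshold: float,
) -> float:
    """Percentage of instances that never reach iou_threshold within max_clicks.

    ious_per_instance[i][k] is assumed to hold the IoU of instance i
    after click k + 1 (hypothetical layout chosen for this sketch).
    """
    if not ious_per_instance:
        raise ValueError("need at least one instance")
    failures = 0
    for ious in ious_per_instance:
        # Only the first max_clicks interactions count towards success.
        reached = any(iou >= iou_threshold for iou in ious[:max_clicks])
        if not reached:
            failures += 1
    return 100.0 * failures / len(ious_per_instance)


def relative_reduction(baseline: float, adapted: float) -> float:
    """Relative reduction in percent of a metric with respect to a baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (baseline - adapted) / baseline


# Toy per-click IoU curves for four instances (made-up values).
toy_ious = [
    [0.50, 0.92],
    [0.30, 0.60, 0.80],
    [0.95],
    [0.40, 0.70, 0.91],
]

fr_2_at_90 = failure_rate(toy_ious, max_clicks=2, iou_threshold=0.90)
fr_3_at_90 = failure_rate(toy_ious, max_clicks=3, iou_threshold=0.90)
print(fr_2_at_90, fr_3_at_90, relative_reduction(fr_2_at_90, fr_3_at_90))
```

Under this reading, the relative reductions reported in the abstract compare the failure rate of the unadapted SAM with that of the adapted model on the same metric, so they are percentages of the baseline failure rate rather than differences in percentage points.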
