
Fantastic Animals and Where to Find Them: Segment Any Marine Animal with Dual SAM (2404.04996v1)

Published 7 Apr 2024 in cs.CV and cs.MM

Abstract: As an important pillar of underwater intelligence, Marine Animal Segmentation (MAS) involves segmenting animals within marine environments. Previous methods struggle to extract long-range contextual features and overlook the connectivity between discrete pixels. Recently, the Segment Anything Model (SAM) has offered a universal framework for general segmentation tasks. However, because it was trained on natural images, SAM lacks prior knowledge of marine images. In addition, SAM's single-position prompt is insufficient for prior guidance. To address these issues, we propose a novel feature learning framework, named Dual-SAM, for high-performance MAS. To this end, we first introduce a dual structure with SAM's paradigm to enhance feature learning of marine images. Then, we propose a Multi-level Coupled Prompt (MCP) strategy to provide comprehensive underwater prior information, and enhance the multi-level features of SAM's encoder with adapters. Subsequently, we design a Dilated Fusion Attention Module (DFAM) to progressively integrate multi-level features from SAM's encoder. Finally, instead of directly predicting the masks of marine animals, we propose a Criss-Cross Connectivity Prediction (C$^3$P) paradigm to capture the inter-connectivity between discrete pixels. With dual decoders, it generates pseudo-labels and achieves mutual supervision for complementary feature representations, resulting in considerable improvements over previous techniques. Extensive experiments verify that our proposed method achieves state-of-the-art performance on five widely used MAS datasets. The code is available at https://github.com/Drchip61/Dual_SAM.
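
To make the idea of predicting pixel connectivity rather than a mask more concrete, the following is a minimal sketch of how connectivity labels could be derived from a binary mask. It assumes a "criss-cross" neighbourhood of four directions (up, down, left, right) and marks a pixel as connected in a direction when both it and its neighbour are foreground. The function name `connectivity_labels`, the direction order, and the treatment of image borders are illustrative assumptions, not the formulation used in the paper or its repository.

```python
from typing import List

Mask = List[List[int]]

# Hypothetical direction order: up, down, left, right (row offset, column offset).
OFFSETS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def connectivity_labels(mask: Mask) -> List[Mask]:
    """Return one binary map per direction for a binary mask.

    A value of 1 at (r, c) in a direction's map means the pixel at (r, c)
    and its neighbour in that direction are both foreground. Neighbours
    outside the image are treated as background (an assumption).
    """
    h, w = len(mask), len(mask[0])
    labels = []
    for dr, dc in OFFSETS:
        channel = [[0] * w for _ in range(h)]
        for r in range(h):
            for c in range(w):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < h and 0 <= nc < w
                if mask[r][c] and inside and mask[nr][nc]:
                    channel[r][c] = 1
        labels.append(channel)
    return labels


# A plus-shaped object: the centre pixel is connected in all four directions.
mask = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]
labels = connectivity_labels(mask)
```

Under these assumptions, a network would be trained to predict the four directional maps, and a mask could be recovered by combining them. How Dual-SAM defines, predicts, and merges connectivity is specified in the paper itself.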
