
Unleashing the Potential of SAM for Medical Adaptation via Hierarchical Decoding (2403.18271v1)

Published 27 Mar 2024 in cs.CV

Abstract: The Segment Anything Model (SAM) has garnered significant attention for its versatile segmentation abilities and intuitive prompt-based interface. However, its application in medical imaging presents challenges, requiring either substantial training costs and extensive medical datasets for full model fine-tuning or high-quality prompts for optimal performance. This paper introduces H-SAM: a prompt-free adaptation of SAM tailored for efficient fine-tuning on medical images via a two-stage hierarchical decoding procedure. In the initial stage, H-SAM employs SAM's original decoder to generate a prior probabilistic mask, which guides a more intricate decoding process in the second stage. Specifically, we propose two key designs: 1) a class-balanced, mask-guided self-attention mechanism that addresses the unbalanced label distribution and enhances the image embedding; 2) a learnable mask cross-attention mechanism that spatially modulates the interplay among different image regions based on the prior mask. Moreover, the inclusion of a hierarchical pixel decoder in H-SAM enhances its proficiency in capturing fine-grained and localized details. This approach enables SAM to effectively integrate learned medical priors, facilitating enhanced adaptation for medical image segmentation with limited samples. H-SAM demonstrates a 4.78% improvement in average Dice compared to existing prompt-free SAM variants for multi-organ segmentation using only 10% of 2D slices. Notably, without using any unlabeled data, H-SAM even outperforms state-of-the-art semi-supervised models relying on extensive unlabeled training data across various medical datasets. Our code is available at https://github.com/Cccccczh404/H-SAM.
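
The abstract reports its headline result as average Dice over multiple organs. As a reminder of what that metric measures, here is a minimal sketch of the standard Dice coefficient, 2|P∩T| / (|P| + |T|), averaged over class labels. It is not taken from the H-SAM repository: the function names `dice_score` and `mean_dice`, the nested-list mask format, and the convention of scoring 1.0 when a class is absent from both masks are assumptions made for illustration.

```python
from typing import List, Sequence

Mask = List[List[int]]  # assumed format: 2D grid of integer class labels


def dice_score(pred: Mask, target: Mask, label: int) -> float:
    """Dice coefficient for one class: 2 * |P ∩ T| / (|P| + |T|)."""
    # Flatten both masks into per-pixel booleans for the chosen label.
    p = [v == label for row in pred for v in row]
    t = [v == label for row in target for v in row]
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    denominator = sum(p) + sum(t)
    if denominator == 0:
        # Assumed convention: class absent from both masks counts as a match.
        return 1.0
    return 2.0 * intersection / denominator


def mean_dice(pred: Mask, target: Mask, labels: Sequence[int]) -> float:
    """Unweighted average of per-class Dice over the given foreground labels."""
    return sum(dice_score(pred, target, lab) for lab in labels) / len(labels)


# Toy example: two foreground classes (1 and 2), background is 0.
pred = [[0, 1, 1],
        [0, 2, 2]]
target = [[0, 1, 0],
          [0, 2, 2]]

per_class = {lab: dice_score(pred, target, lab) for lab in (1, 2)}
average = mean_dice(pred, target, labels=(1, 2))
```

In the toy example, class 1 is over-segmented by one pixel while class 2 matches exactly, so the unweighted average sits between the two per-class scores. Averaging per class rather than per pixel means small organs weigh as much as large ones in the reported number.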

Authors (7)
  1. Zhiheng Cheng
  2. Qingyue Wei
  3. Hongru Zhu
  4. Yan Wang
  5. Liangqiong Qu
  6. Wei Shao
  7. Yuyin Zhou