
How to build the best medical image segmentation algorithm using foundation models: a comprehensive empirical study with Segment Anything Model (2404.09957v2)

Published 15 Apr 2024 in cs.CV and cs.LG

Abstract: Automated segmentation is a fundamental medical image analysis task, which enjoys significant advances due to the advent of deep learning. While foundation models have been useful in natural language processing and some vision tasks for some time, the foundation model developed with image segmentation in mind - Segment Anything Model (SAM) - has been developed only recently and has shown similar promise. However, there are still no systematic analyses or "best-practice" guidelines for optimal fine-tuning of SAM for medical image segmentation. This work summarizes existing fine-tuning strategies with various backbone architectures, model components, and fine-tuning algorithms across 18 combinations, and evaluates them on 17 datasets covering all common radiology modalities. Our study reveals that (1) fine-tuning SAM leads to slightly better performance than previous segmentation methods, (2) fine-tuning strategies that use parameter-efficient learning in both the encoder and decoder are superior to other strategies, (3) network architecture has a small impact on final performance, (4) further training SAM with self-supervised learning can improve final model performance. We also demonstrate the ineffectiveness of some methods popular in the literature and further expand our experiments into few-shot and prompt-based settings. Lastly, we released our code and MRI-specific fine-tuned weights, which consistently obtained superior performance over the original SAM, at https://github.com/mazurowski-lab/finetune-SAM.

An Empirical Analysis of Fine-Tuning Approaches for Medical Image Segmentation Using the Segment Anything Model

The paper presents a comprehensive empirical study of fine-tuning strategies for the Segment Anything Model (SAM) in medical image segmentation. It evaluates 18 configurations that combine different backbone architectures, model components, and fine-tuning algorithms. These configurations are tested on 17 datasets covering all common radiology modalities, which gives the comparison a broad empirical basis.

Key Findings and Methodological Insights

The paper concludes that fine-tuning SAM yields slightly better performance than previous segmentation methods. Configurations that apply parameter-efficient learning to both the encoder and the decoder outperform the other strategies tested. Network architecture has only a small impact on final performance, which suggests that smaller backbones may be adequate for many medical segmentation tasks. Further training SAM with self-supervised learning can also improve final performance in the medical domain.
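
To make "parameter-efficient learning" concrete, the sketch below shows one widely used technique of this kind, low-rank adaptation (LoRA), applied to a single linear layer. It is an illustration only: this summary does not state which parameter-efficient method the authors applied, and the class name LoRALinear, the helper matmul, and the parameters rank and alpha are hypothetical names chosen for the example. The pretrained weight stays frozen, and only two small matrices A and B are trained.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


class LoRALinear:
    """Hypothetical LoRA-style layer: y = (W + (alpha / rank) * B @ A) x.

    W is the frozen pretrained weight (out_dim x in_dim). Only A and B
    would be updated during fine-tuning.
    """

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        self.weight = weight  # frozen, never updated
        # A starts with small random values, B starts at zero, so the
        # layer initially behaves exactly like the pretrained layer.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)]
                  for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]
        self.scale = alpha / rank

    def effective_weight(self):
        """Frozen weight plus the scaled low-rank update B @ A."""
        delta = matmul(self.B, self.A)
        return [[w + self.scale * d for w, d in zip(w_row, d_row)]
                for w_row, d_row in zip(self.weight, delta)]

    def forward(self, x):
        """Apply the adapted layer to one input vector."""
        return [sum(w * xi for w, xi in zip(row, x))
                for row in self.effective_weight()]

    def trainable_params(self):
        """Number of values in A and B, the only trainable matrices."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


# Toy usage: an 8x8 identity matrix stands in for a pretrained weight.
identity = [[1.0 if i == j else 0.0 for j in range(8)] for i in range(8)]
layer = LoRALinear(identity, rank=2)
full_params = len(identity) * len(identity[0])
print(layer.trainable_params(), "trainable values vs", full_params, "in the full weight")
```

In this toy setting the adapter trains 32 values instead of 64. The saving grows with layer width, because the adapter size scales with rank * (in_dim + out_dim) rather than in_dim * out_dim.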

The authors also show that some methods popular in the literature are ineffective, which calls for a re-evaluation of current practice in medical image segmentation with foundation models. The experiments extend to few-shot and prompt-based settings. Few-shot results are especially relevant because labeled data is often scarce in medical imaging.

Practical and Theoretical Implications

Practically, this research gives practitioners in medical imaging a roadmap for using SAM effectively by laying out concrete fine-tuning strategies. These insights could inform the development of more robust and generalizable medical imaging applications and may speed the integration of SAM into clinical workflows. Because network architecture has only a small impact on final performance, smaller models are worth considering in resource-constrained environments, which broadens the applicability of SAM-based segmentation.

Theoretically, this work contributes to the ongoing discourse on the adaptability of foundation models from natural to specialized domains. By offering rigorous experimentation and analysis, the research provides foundational insights pertinent to the continued exploration of foundation models like SAM beyond general-purpose tasks. This could fuel further enhancements in self-supervised learning techniques and fine-tuning methodologies tailored for niche applications within the medical domain.

Future Directions in AI and Medical Imaging

Looking ahead, the nuances identified in the paper underscore the necessity for continued exploration of unsupervised and semi-supervised learning techniques that could enable foundation models like SAM to autonomously and efficiently adapt to specialized tasks. Future research may focus on integrating domain-specific knowledge within pre-training phases or embeddings to bridge the performance gap between general and specialized tasks.

Additionally, as data availability continues to challenge advancements in medical image analysis, expanding datasets with varied and representative samples could yield richer pre-training opportunities. This endeavor could involve harnessing synthetic data generation, federated learning, and multimodal datasets to cultivate more robust foundation models.

In conclusion, this paper presents a critical analysis of fine-tuning approaches for medical image segmentation with SAM. By identifying which strategies work and which do not, it serves as a reference point for researchers and practitioners who want to adapt foundation models to medical imaging applications.

Authors (4)
  1. Hanxue Gu (22 papers)
  2. Haoyu Dong (55 papers)
  3. Jichen Yang (28 papers)
  4. Maciej A. Mazurowski (51 papers)
Citations (7)