Evaluation of the Segment Anything Model on a Diverse Medical Image Dataset
The paper "Segment Anything Model for Medical Images?" presents a rigorous evaluation of the Segment Anything Model (SAM) across a substantial medical image segmentation dataset, termed COSMOS 1050K. This dataset, developed by Yuhao Huang and colleagues, is one of the most extensive collections available, encompassing 53 public datasets with 18 distinct modalities and 84 categories of objects. The paper sheds light on the relative proficiency of SAM in medical image segmentation (MIS), an area marked by complex object structures and boundaries.
Key Findings and Methodology
The research presents a comprehensive evaluation of SAM using two model versions, ViT-B and ViT-H, under six distinct testing strategies that range from a fully automatic mode to several manual prompt modes. The evaluations use standard metrics, including the Dice coefficient, Jaccard similarity coefficient, and Hausdorff distance (a minimal implementation of these metrics is sketched after the list below). The paper's findings are as follows:
- Model Size and Performance: ViT-H, the larger of the two models, generally outperformed ViT-B across the evaluation modes. This suggests that the greater capacity afforded by ViT-H's larger parameter count is better suited to the nuanced task of MIS.
- Automatic vs. Manual Prompts: SAM without prompts (Everything mode) showed notable deficiencies across the datasets. Performance improved significantly when manual hints, particularly box prompts, were provided (see the prompting sketch after this list). This suggests that while SAM holds promise in zero-shot scenarios, its reliability in medical settings depends on guided input.
- Impact of Object Attributes: The study also relates SAM's segmentation performance to object characteristics such as boundary complexity and intensity contrast. The results reveal a moderate influence of these attributes on SAM's accuracy, highlighting areas where model refinements could help.
- Task-specific Finetuning: Finetuning SAM on specific medical tasks led to notable performance gains, with improvements in Dice scores for both the ViT-B and ViT-H models (a finetuning sketch follows the list below). This underscores the potential for SAM to become a strong performer in focused medical applications through tailored training.
- Annotation Aid and Time Efficiency: The paper demonstrated SAM's capability to support human annotators by reducing labeling time and improving quality, a critical factor for large-scale medical data annotation.
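To make the evaluation protocol concrete, the sketch below gives minimal NumPy/SciPy implementations of the three reported metrics for 2D binary masks. It is an illustrative reimplementation, not the authors' evaluation code; the function names and the symmetric Hausdorff formulation are assumptions made here.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def dice_coefficient(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice = 2 * |P ∩ G| / (|P| + |G|) for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    overlap = np.logical_and(pred, gt).sum()
    total = pred.sum() + gt.sum()
    return 2.0 * overlap / total if total > 0 else 1.0

def jaccard_index(pred: np.ndarray, gt: np.ndarray) -> float:
    """Jaccard (IoU) = |P ∩ G| / |P ∪ G| for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    overlap = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return overlap / union if union > 0 else 1.0

def hausdorff_distance(pred: np.ndarray, gt: np.ndarray) -> float:
    """Symmetric Hausdorff distance between the foreground pixel sets (in pixels)."""
    pred_pts = np.argwhere(pred.astype(bool))
    gt_pts = np.argwhere(gt.astype(bool))
    return max(directed_hausdorff(pred_pts, gt_pts)[0],
               directed_hausdorff(gt_pts, pred_pts)[0])
```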
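The contrast between the automatic "Everything" mode and a box-prompted prediction can be reproduced with the public segment-anything package. The sketch below is not the paper's evaluation pipeline: the checkpoint path, the dummy image, and the example box are placeholders for illustration.

```python
import numpy as np
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator, SamPredictor

# Official ViT-B weights; the local file path is an assumption here.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")

# A medical slice converted to H x W x 3 uint8; a dummy array stands in here.
image = np.zeros((512, 512, 3), dtype=np.uint8)

# "Everything" mode: fully automatic, no prompts; the setting that struggled most in the study.
auto_masks = SamAutomaticMaskGenerator(sam).generate(image)

# Box-prompt mode: a bounding box around the target structure guides the prediction.
predictor = SamPredictor(sam)
predictor.set_image(image)
box = np.array([100, 100, 300, 300])  # x_min, y_min, x_max, y_max in pixel coordinates
masks, scores, _ = predictor.predict(box=box, multimask_output=False)
```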
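Task-specific finetuning keeps SAM's architecture intact. One common recipe, sketched below under assumptions made here rather than as the authors' exact setup, freezes the image and prompt encoders and adapts only the lightweight mask decoder with a Dice-style loss.

```python
import torch
from segment_anything import sam_model_registry

# checkpoint=None builds the architecture with random weights; point it at real SAM weights in practice.
sam = sam_model_registry["vit_b"](checkpoint=None)

# Freeze the heavy image encoder and the prompt encoder; adapt only the mask decoder.
for module in (sam.image_encoder, sam.prompt_encoder):
    for p in module.parameters():
        p.requires_grad = False

optimizer = torch.optim.AdamW(sam.mask_decoder.parameters(), lr=1e-4)

def soft_dice_loss(logits: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """1 - soft Dice between predicted mask probabilities and the ground-truth mask."""
    probs = torch.sigmoid(logits)
    num = 2.0 * (probs * target).sum() + eps
    den = probs.sum() + target.sum() + eps
    return 1.0 - num / den

# Training loop (omitted): for each image, run the frozen encoders, pass a box prompt
# derived from the ground-truth mask, predict mask logits with sam.mask_decoder,
# compute soft_dice_loss against the label, and step the optimizer.
```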
Implications and Future Directions
The implications of the findings are manifold. Practically, SAM offers a substantial starting point for developing more efficient medical segmentation tools, given its ability to be finetuned and to assist human annotators. Theoretically, the paper sets a precedent for evaluating foundation models on medical datasets, encouraging further research into domain-specific adaptations of general-purpose models like SAM.
For future developments, several avenues are highlighted. First, improving SAM's interactive capabilities to handle multi-round prompt scenarios could bolster its utility in medical contexts. Additionally, exploring end-to-end semantic segmentation by combining SAM with models such as CLIP or with open-vocabulary object detection (OVOD) approaches could pave the way for robust identification and classification of medical objects.
As SAM's performance still shows variability across modalities and tasks, subsequent efforts may focus on leveraging synthetic data generation to supplement medical datasets, enabling better training for zero-shot capabilities. Furthermore, enhancing SAM's adaptability to both 2D and 3D data would address the breadth of medical imaging modalities.
In conclusion, while SAM demonstrates potential for medical image segmentation, this research identifies significant opportunities for further enhancement and adaptation to meet the diverse demands of the field. The paper provides a crucial framework for researchers looking to extend the application of foundational models within the domain of medical imaging.