Enhancing Medical Image Segmentation with the SAMAug Method
Recent developments in computer vision have opened a potential avenue for improving medical image segmentation through foundation models. In particular, the Segment Anything Model (SAM) has emerged as a foundational tool capable of general-purpose segmentation across diverse visual contexts. Despite this broad applicability, applying SAM directly to medical images has shown limited success. The paper "Input Augmentation with SAM: Boosting Medical Image Segmentation with Segmentation Foundation Model" systematically explores how SAM-generated outputs can be leveraged to enhance the performance of specialized medical image segmentation models, through a proposed method termed SAMAug.
The core challenge addressed in this work is that general-purpose models fail to capture the domain-specific intricacies of medical imaging, where expert knowledge is crucial. SAM, trained on an extensive dataset of 11 million images and over 1 billion masks, produces segmentation outputs including masks, stability scores, and features; yet initial tests indicate that applying it directly to medical datasets yields inadequate performance.
SAMAug Methodology and Implementation
The SAMAug method augments input images with SAM-generated segmentation prior maps and boundary prior maps. These maps enrich the original input by embedding semantically relevant structures derived from SAM, departing from conventional data augmentation paradigms, which rely primarily on transformations such as rotation or cropping.
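As an illustration of this idea (a sketch, not the paper's exact implementation), the two prior maps can be built in NumPy from a set of binary masks such as those returned by SAM's automatic mask generator. The stability-score weighting and the finite-difference boundary extraction below are assumptions made for illustration:

```python
import numpy as np

def build_prior_maps(masks, stability_scores, image_shape):
    """Build a segmentation prior map and a boundary prior map
    from a list of binary masks (e.g. SAM 'segment everything' outputs)."""
    seg_prior = np.zeros(image_shape, dtype=np.float32)
    boundary_prior = np.zeros(image_shape, dtype=np.float32)
    for mask, score in zip(masks, stability_scores):
        m = mask.astype(np.float32)
        # Accumulate each mask, weighted by its stability score (assumption).
        seg_prior += score * m
        # Approximate the mask boundary via finite differences of the mask.
        edge = np.zeros_like(m)
        edge[1:, :] += np.abs(m[1:, :] - m[:-1, :])
        edge[:, 1:] += np.abs(m[:, 1:] - m[:, :-1])
        boundary_prior += np.clip(edge, 0.0, 1.0)
    # Normalise both maps to [0, 1] before use as extra channels.
    for prior in (seg_prior, boundary_prior):
        if prior.max() > 0:
            prior /= prior.max()
    return seg_prior, boundary_prior

def augment_input(image, seg_prior, boundary_prior):
    """Stack a grayscale image with the two priors into a 3-channel input."""
    return np.stack([image, seg_prior, boundary_prior], axis=0)
```

The augmented tensor can then be fed to any segmentation network whose first convolution accepts the extra channels.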
The prior maps are integrated as additional input channels of conventional medical image segmentation models such as U-Net. SAMAug does not modify the parameters of SAM itself, so its computation and memory footprint remains minimal compared to approaches involving fine-tuning or additional adaptation layers. Training with SAMAug combines augmented and raw image inputs, optimizing the model via a loss function that weights the contributions of the two input types. This dual-input methodology allows segmentation models to benefit from both raw and SAM-augmented data, which matters particularly in scenarios where SAM alone underperforms.
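A minimal sketch of such a weighted dual-input loss, assuming binary cross-entropy as the base loss and a scalar mixing weight `weight_aug` (both illustrative choices; the paper's exact loss terms and weights may differ):

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Per-pixel binary cross-entropy, averaged over the image."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(pred)
                          + (1.0 - target) * np.log(1.0 - pred)))

def combined_loss(pred_aug, pred_raw, target, weight_aug=0.5):
    """Weighted sum of the losses on SAM-augmented and raw inputs.

    pred_aug: predictions from the SAM-augmented input
    pred_raw: predictions from the raw image input
    weight_aug: relative weight of the augmented branch (assumption)
    """
    return (weight_aug * bce(pred_aug, target)
            + (1.0 - weight_aug) * bce(pred_raw, target))
```

In an actual training loop the two prediction maps would come from forward passes on the augmented and raw inputs, and the combined scalar would be backpropagated as usual.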
Experimental Validation
Empirical evidence of SAMAug's merits is presented through experiments on three benchmark datasets: Polyp, MoNuSeg, and GlaS. In polyp segmentation, SAMAug demonstrates a notable improvement in Dice scores over the strong HSNet baseline. For nucleus segmentation on the MoNuSeg dataset, SAMAug improves Aggregated Jaccard Index (AJI) and F-score metrics across multiple architectures, including U-Net and Attention UNet. Similarly, in gland segmentation on the GlaS dataset, SAMAug achieves higher F-score and Object Dice measures than traditional approaches.
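For reference, the standard pixel-level forms of these metrics can be computed as below. Note that AJI additionally aggregates intersection-over-union over matched object instances, which this sketch does not implement:

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice = 2*|P∩T| / (|P| + |T|) on binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def jaccard_index(pred, target, eps=1e-7):
    """IoU = |P∩T| / |P∪T| on binary masks.

    AJI extends this by matching predicted and ground-truth object
    instances and aggregating their intersections and unions.
    """
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (inter + eps) / (union + eps)
```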
These results suggest that SAMAug significantly enhances segmentation accuracy and underscore its effectiveness in mitigating the shortcomings of applying SAM directly to medical images. Qualitative examples corroborate that SAM's segmentation serves as a robust prior, helping the downstream models achieve higher-fidelity segmentation outputs.
Implications and Future Directions
By demonstrating an efficient and effective augmentation strategy, the paper establishes foundational work for integrating SAM with existing medical segmentation pipelines. The implications are both practical, since existing models can be made more precise without modifying SAM, and theoretical, introducing a pathway for bolstering the utility of foundation models on specialized tasks.
Future research directions might investigate optimizing the augmentation function for greater robustness, adapting SAMAug to handle other medical imaging modalities, and leveraging the learned representations for uncertainty estimation and other predictive applications within clinical settings. Moreover, scalability to real-time applications and integration within automated diagnostic systems would significantly broaden the utility of the SAMAug approach.
In summary, SAMAug marks a strategic advance in medical image processing, bridging SAM's general perceptual abilities and the domain-specific requirements of medical segmentation tasks. This underscores the potential of foundation models to elevate medical imaging standards when appropriately combined with task-specific methodologies.