
MedLSAM: Localize and Segment Anything Model for 3D CT Images (2306.14752v4)

Published 26 Jun 2023 in cs.CV

Abstract: Recent advancements in foundation models have shown significant potential in medical image analysis. However, there is still a gap in models specifically designed for medical image localization. To address this, we introduce MedLAM, a 3D medical foundation localization model that accurately identifies any anatomical part within the body using only a few template scans. MedLAM employs two self-supervision tasks: unified anatomical mapping (UAM) and multi-scale similarity (MSS) across a comprehensive dataset of 14,012 CT scans. Furthermore, we developed MedLSAM by integrating MedLAM with the Segment Anything Model (SAM). This innovative framework requires extreme point annotations across three directions on several templates to enable MedLAM to locate the target anatomical structure in the image, with SAM performing the segmentation. It significantly reduces the amount of manual annotation required by SAM in 3D medical imaging scenarios. We conducted extensive experiments on two 3D datasets covering 38 distinct organs. Our findings are twofold: 1) MedLAM can directly localize anatomical structures using just a few template scans, achieving performance comparable to fully supervised models; 2) MedLSAM closely matches the performance of SAM and its specialized medical adaptations with manual prompts, while minimizing the need for extensive point annotations across the entire dataset. Moreover, MedLAM has the potential to be seamlessly integrated with future 3D SAM models, paving the way for enhanced segmentation performance. Our code is public at https://github.com/openmedlab/MedLSAM.

Citations (3)

Summary

  • The paper introduces MedLSAM, which combines a novel 3D localization model (MedLAM) with SAM to achieve automated CT segmentation with minimal manual input.
  • It leverages the Unified Anatomical Mapping and Multi-Scale Similarity self-supervision tasks, trained on 14,012 CT scans, to localize anatomical structures with accuracy comparable to fully supervised models.
  • The integration of Sub-Patch Localization refines segmentation by dividing structures into smaller segments, notably improving head-and-neck imaging accuracy.

MedLSAM: Localize and Segment Anything Model for 3D CT Images

The paper introduces MedLSAM, an automated framework for medical image segmentation that couples a novel 3D localization foundation model, MedLAM, with the Segment Anything Model (SAM). The framework targets the high annotation workload of 3D medical datasets: MedLAM localizes the target anatomy in a CT scan from only a few annotated template scans, and SAM performs the segmentation from the resulting prompts.
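To make the two-stage design concrete, below is a minimal sketch of how such a localize-then-segment pipeline could be wired together using the public segment-anything API. Here `medlam_localize` and `template_points` are hypothetical placeholders for MedLAM's localization step and the annotated template extreme points; they are not the authors' actual interface.

```python
import numpy as np
from segment_anything import sam_model_registry, SamPredictor  # public SAM API

def segment_with_medlam_box(volume, medlam_localize, template_points,
                            sam_checkpoint="sam_vit_b.pth"):
    """volume: (D, H, W) CT array already windowed/scaled to 0-255."""
    # Stage 1 -- localization: a MedLAM-style localizer maps the few template
    # extreme points into the query scan and returns a 3D box
    # (z0, y0, x0, z1, y1, x1).  `medlam_localize` is a stand-in callable.
    z0, y0, x0, z1, y1, x1 = medlam_localize(volume, template_points)

    sam = sam_model_registry["vit_b"](checkpoint=sam_checkpoint)
    predictor = SamPredictor(sam)

    # Stage 2 -- segmentation: prompt SAM slice by slice with the projected 2D box.
    mask_3d = np.zeros(volume.shape, dtype=bool)
    for z in range(z0, z1 + 1):
        slice_rgb = np.repeat(volume[z][..., None], 3, axis=2).astype(np.uint8)
        predictor.set_image(slice_rgb)
        masks, _, _ = predictor.predict(box=np.array([x0, y0, x1, y1]),
                                        multimask_output=False)
        mask_3d[z] |= masks[0]
    return mask_3d
```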

Central to this framework is MedLAM, a foundation model designed for precise localization of anatomical structures in 3D medical images. It leverages two key self-supervised tasks: Unified Anatomical Mapping (UAM) and Multi-Scale Similarity (MSS), which project anatomical structures onto a shared latent space and refine localization using local feature similarity, respectively. Trained on 14,012 CT scans from 16 datasets, MedLAM achieves localization accuracy that rivals that of fully supervised counterparts, demonstrating a significant reduction in annotation dependency.
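The following is a conceptual sketch of this coarse-to-fine localization, assuming a trained network that assigns each query voxel a coordinate in the shared anatomical space (the UAM idea) and a local feature map for similarity matching (the MSS idea). Tensor names and the fixed refinement window are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
import torch
import torch.nn.functional as F

def localize_landmark(query_coords, query_feats, template_coord, template_feat,
                      radius=4):
    """
    query_coords:   (3, D, H, W) predicted anatomical coordinates of the query scan
    query_feats:    (C, D, H, W) local feature map of the query scan
    template_coord: (3,)         anatomical coordinate of the annotated template point
    template_feat:  (C,)         feature vector extracted at that template point
    """
    # Coarse step (UAM idea): pick the voxel whose predicted anatomical
    # coordinate lies closest to the template landmark's coordinate.
    dist = ((query_coords - template_coord.view(3, 1, 1, 1)) ** 2).sum(dim=0)
    z, y, x = np.unravel_index(int(torch.argmin(dist)), tuple(dist.shape))

    # Refinement step (MSS idea): inside a small window around the coarse
    # estimate, pick the voxel whose local feature is most similar to the
    # template feature.
    D, H, W = dist.shape
    z0, y0, x0 = max(z - radius, 0), max(y - radius, 0), max(x - radius, 0)
    z1, y1, x1 = min(z + radius + 1, D), min(y + radius + 1, H), min(x + radius + 1, W)
    window = query_feats[:, z0:z1, y0:y1, x0:x1]                      # (C, d, h, w)
    sim = F.cosine_similarity(window, template_feat.view(-1, 1, 1, 1), dim=0)
    dz, dy, dx = np.unravel_index(int(torch.argmax(sim)), tuple(sim.shape))
    return z0 + dz, y0 + dy, x0 + dx
```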

MedLSAM is evaluated on two 3D datasets covering 38 organs, showing that integrating MedLAM with SAM yields segmentation results closely aligned with those obtained from manual prompts. The gains are most evident for head-and-neck organs, where high anatomical consistency supports accurate localization, narrowing the gap between automatic and manually prompted segmentation.

The ablation studies highlight the synergy between UAM and MSS, reinforcing the model's robustness across diverse anatomical contexts. Furthermore, the introduced Sub-Patch Localization (SPL) improves segmentation accuracy by dividing each structure into smaller sub-volumes along the scan axis and localizing each separately, outperforming a single whole-structure bounding box.
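As a rough illustration of the SPL idea (with hypothetical helper names, not the repository's API): the target's extent along the scan axis is split into sub-patches and a tighter box is localized per patch, so slice-level prompts follow the organ's changing cross-section instead of one loose global box.

```python
def sub_patch_boxes(volume, z_start, z_end, localize_box_2d, patch_size=6):
    """Return {slice_index: (x0, y0, x1, y1)} box prompts for SAM."""
    boxes = {}
    for z0 in range(z_start, z_end, patch_size):
        z1 = min(z0 + patch_size, z_end)
        # One localization per sub-patch instead of one box for the whole organ;
        # every slice inside the patch reuses that patch's tighter box.
        box = localize_box_2d(volume[z0:z1])   # hypothetical per-patch localizer
        for z in range(z0, z1):
            boxes[z] = box
    return boxes
```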

MedLSAM's implications are broad, offering a substantial reduction in manual intervention for medical imaging tasks. This automated approach not only promises to streamline segmentation processes in clinical and research settings but also sets a precedent for extending such methodologies to other areas of imaging. Potential future integrations with 3D SAM models may further advance its performance, indicating a promising trajectory for refinement and application in complex imaging environments.

In conclusion, MedLSAM provides a compelling step toward fully automated medical image segmentation, aligning closely with the performance of existing models while offering a scalable solution for large-scale dataset annotation. The framework’s reliance on minimal manual inputs highlights its potential to significantly impact medical image analysis.
