
AtlasSegFM: One-Shot Segmentation Adaptation

Updated 27 December 2025
  • AtlasSegFM is a one-shot segmentation framework that leverages a single annotated atlas to adapt pretrained models to new clinical scenarios.
  • It fuses deformable atlas registration for prompt generation with a test-time learnable fusion adapter that refines segmentation outputs.
  • Experimental evaluations show significant Dice score improvements, especially for small or underrepresented structures across diverse imaging modalities.

AtlasSegFM is a one-shot customization framework designed to adapt segmentation foundation models to novel clinical contexts utilizing a single annotated atlas. By fusing global anatomical priors from atlas registration with the refinement capabilities of pretrained segmentation foundation models, AtlasSegFM achieves robust, generalizable performance across diverse medical imaging modalities and anatomical targets, especially excelling on small or underrepresented structures. Central to AtlasSegFM are: (1) context-aware prompt generation via deformable atlas registration, and (2) a test-time, learnable fusion adapter that combines atlas and model predictions. These components are trained per-case, with the core segmentation model remaining entirely frozen, requiring only one annotated support image for adaptation. The framework is validated across public and in-house datasets and integrates seamlessly into existing clinical inference pipelines (Zhang et al., 20 Dec 2025).

1. Registration-Driven Contextual Prompt Generation

AtlasSegFM initiates adaptation through a two-stage test-time registration aligning a pre-segmented atlas $(X_\mathrm{atlas}, Y_\mathrm{atlas})$ to a new query volume $X_\mathrm{query}$. The registration seeks a spatial transform $T: \mathbb{R}^3 \to \mathbb{R}^3$ minimizing

$$T^\star = \arg\min_T \mathcal{D}(X_\mathrm{atlas} \circ T, X_\mathrm{query}) + \lambda\, \mathcal{R}(T),$$

where $\mathcal{D}$ denotes an image-similarity measure (SSD, NCC, or MI) and $\mathcal{R}(T)$ enforces transformation smoothness. Practically, registration comprises rigid and affine steps, followed by a test-time optimized VoxelMorph-style U-Net, converging within ≈1.5 min on RTX 4090 hardware.
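The registration objective above can be illustrated with a deliberately minimal stand-in: an exhaustive search over integer translations minimizing an SSD similarity term. This is only a sketch of the form $\arg\min_T \mathcal{D}(X_\mathrm{atlas} \circ T, X_\mathrm{query})$; the paper's actual method uses rigid/affine stages plus a gradient-optimized VoxelMorph-style deformable U-Net, and the smoothness term $\lambda\,\mathcal{R}(T)$ is omitted here since a pure translation is already smooth.

```python
import numpy as np

def ssd(a, b):
    """Sum-of-squared-differences similarity term D."""
    return float(np.sum((a - b) ** 2))

def register_translation(atlas, query, search=2):
    """Toy stand-in for the registration objective: exhaustively search
    integer 2D translations T minimizing D(atlas o T, query).
    The paper instead optimizes rigid, affine, and deformable stages."""
    best_shift, best_cost = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            warped = np.roll(atlas, shift=(dy, dx), axis=(0, 1))
            cost = ssd(warped, query)
            if cost < best_cost:
                best_cost, best_shift = cost, (dy, dx)
    return best_shift, best_cost
```

For a query that is an exact shifted copy of the atlas, the search recovers the shift with zero residual cost.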

Once $T^\star$ is computed, atlas labels are warped to the query ($M_\mathrm{atlas} = Y_\mathrm{atlas} \circ T^\star$), delivering a coarse, globally consistent segmentation. This mask supplies predefined prompt types to the downstream foundation model: click-prompts (centroid of the largest connected component), box-prompts (minimal axis-aligned bounding box), or full mask-prompts, as dictated by the prompt preference of the specific foundation model (such as nnInteractive or MedSAM2).
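The prompt-extraction step can be sketched as follows. This is a simplified illustration assuming a single foreground component (the paper takes the centroid of the *largest* connected component for clicks); the dictionary keys are hypothetical names, not the paper's API.

```python
import numpy as np

def prompts_from_mask(mask):
    """Derive foundation-model prompts from a warped atlas mask M_atlas.
    Returns a click point (rounded voxel centroid), a minimal
    axis-aligned bounding box (min_corner, max_corner), and the full
    mask, covering the three prompt types described in the text."""
    coords = np.argwhere(mask > 0.5)
    click = tuple(np.round(coords.mean(axis=0)).astype(int))
    box = (tuple(coords.min(axis=0)), tuple(coords.max(axis=0)))
    return {"click": click, "box": box, "mask": mask}
```

Which of the three entries is actually passed downstream depends on the foundation model's preferred prompt type.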

2. Test-Time Fusion Adapter: Architecture and Training

Segmentation outputs from the foundation model $f_\mathrm{FM}$ using atlas-derived prompts ($M_\mathrm{fm} = f_\mathrm{FM}(X_\mathrm{query}, \mathrm{Prompt}_i)$) may lack satisfactory global context or miss fine details. AtlasSegFM introduces a lightweight fusion module employing a Kalman-filter-style update:

$$M_\mathrm{final} = (1 - K)\, M_\mathrm{fm} + K\, M_\mathrm{atlas},$$

with gain field $K = \sigma(g([M_\mathrm{atlas}, M_\mathrm{fm}]))$, predicted by a small 3D network $g$ acting on the concatenated atlas and model probability maps. The network $g$ consists of parallel 3D max-pool paths (kernels 3, 5, 7), channel-wise fusion, several 3D convolutions, and a $1 \times 1$ convolution, with a sigmoid activation to ensure $K(x) \in [0,1]$ per voxel.
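The gated update itself is simple; the learned part is only the gain. The sketch below substitutes a precomputed `gain_logits` array for the output of the small 3D network $g$ (which is not reproduced here), keeping just the Kalman-style combination and the sigmoid squashing of the gain.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def kalman_fuse(m_atlas, m_fm, gain_logits):
    """Kalman-filter-style fusion M_final = (1-K)*M_fm + K*M_atlas.
    `gain_logits` stands in for the output of the small 3D network g
    applied to the concatenated probability maps; the sigmoid keeps
    the per-voxel gain K in [0, 1]."""
    K = sigmoid(gain_logits)
    return (1.0 - K) * m_fm + K * m_atlas
```

With zero logits the gain is 0.5 everywhere, i.e. an even blend of the atlas and foundation-model probabilities; large positive logits defer to the atlas, large negative logits to the model.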

The fusion adapter $g$ is trained at test time (using the single available support atlas and its label) to minimize a supervised Dice loss:

$$\mathcal{L}_\mathrm{fuse} = 1 - \frac{2 \sum_x M_\mathrm{final}(x)\, Y_\mathrm{support}(x)}{\sum_x M_\mathrm{final}(x) + \sum_x Y_\mathrm{support}(x)}.$$

This optimization (≈0.3 min, ≈0.1M parameters) leaves the backbone $f_\mathrm{FM}$ frozen.
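The loss itself is the standard soft Dice formulation and translates directly to code; the small `eps` guard against empty masks is an implementation convenience added here, not stated in the source.

```python
import numpy as np

def dice_loss(m_final, y_support, eps=1e-8):
    """Supervised soft Dice loss L_fuse used to tune the fusion adapter.
    `eps` avoids division by zero for empty masks (our addition)."""
    num = 2.0 * np.sum(m_final * y_support)
    den = np.sum(m_final) + np.sum(y_support) + eps
    return 1.0 - num / den
```

A perfect prediction yields a loss near 0; a completely disjoint one yields 1.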

3. One-Shot Customization Dynamics

AtlasSegFM’s adaptation pipeline is uniquely defined by per-query, per-context test-time optimization on a single annotated atlas. The operational sequence is:

a) Optimize the registration network on $(X_\mathrm{atlas}, X_\mathrm{query})$ to minimize $\mathcal{D} + \lambda \mathcal{R}$;

b) Generate compatible prompts from the warped atlas labels $M_\mathrm{atlas}$;

c) Run the frozen foundation model $f_\mathrm{FM}$ to obtain $M_\mathrm{fm}$;

d) Tune only the fusion network $g$ via the Dice loss using the support label $Y_\mathrm{support}$.

No further backbone fine-tuning or multi-shot data are required; both test-time modules (registration, fusion) are derived from the provided single atlas. The “one-shot” mechanism enables customization to new tasks with minimal annotation or retraining overhead.
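Steps a)–d) can be strung together as a single per-query routine. The sketch below is purely structural: the three callables are placeholders for the registration optimizer, the frozen foundation model, and the fusion-adapter fitting procedure described above, and the mask-prompt variant is assumed for simplicity.

```python
def one_shot_adapt(x_atlas, y_atlas, x_query, y_support,
                   register, foundation_model, fit_fusion):
    """Hedged sketch of AtlasSegFM's per-query pipeline (steps a-d).
    `register(x_atlas, x_query)` returns a transform callable;
    `foundation_model` stays frozen throughout; `fit_fusion` performs
    the test-time Dice-loss tuning of the fusion network g and
    returns the fitted fusion callable."""
    transform = register(x_atlas, x_query)       # (a) test-time registration
    m_atlas = transform(y_atlas)                 # warp atlas labels to query
    prompt = m_atlas                             # (b) mask-prompt variant
    m_fm = foundation_model(x_query, prompt)     # (c) frozen backbone inference
    fuse = fit_fusion(m_atlas, m_fm, y_support)  # (d) tune fusion adapter g
    return fuse(m_atlas, m_fm)
```

Because only `register` and `fit_fusion` involve optimization, swapping in a new clinical context means supplying a new annotated atlas, nothing more.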

4. Experimental Evaluation Across Datasets and Tasks

AtlasSegFM underwent evaluation on six datasets, covering CT and MRI, and encompassing both large organs and small, intricate structures:

Dataset Imaging Modality Key Structures
Abd-CT CT Liver, kidneys, spleen
Abd-MR MRI (T2-SPIR) Liver, kidneys, spleen
AVT CTA Aortic vessel tree
Fe-MRA MRA 12 arteries/veins
OASIS MRI Whole brain
BrainRT CT/MRI Brain, organs-at-risk

Reported metrics include Dice, Hausdorff 95, Normalized Surface Dice (NSD), and clDice (continuity-aware for vessels).

Key findings:

  • AtlasSegFM outperformed foundation models without customization (nnInteractive, vesselFM), 2D slice-wise ICL baselines (UniverSeg, Tyche, Iris), and a supervised nnU-Net trained on small datasets.
  • On Abd-CT, mean Dice increased from ~67% (best ICL) to 72.9% (+5.4%); on BrainRT “organs-at-risk,” from 39% to 77.1%.
  • Pre-registration improved atlas Dice from ~44% to 70% (Abd-MR).
  • Prompting $f_\mathrm{FM}$ with $M_\mathrm{atlas}$ raised Dice by 17% versus no prompt.
  • Adaptive fusion contributed an additional 12% Dice enhancement.
  • Gains were most evident for small, fine structures (e.g., optic nerves, thin vessels).

5. Computational Efficiency and Deployment Considerations

Only two compact test-time modules are learned: registration (≈1.3M parameters, 5-layer U-Net) and fusion adapter (≈0.1M). The foundation model backbone remains entirely frozen.

Empirical runtime (for a $256^3$ volume, RTX 4090):

  • Registration: ≈1.5 min
  • Foundation model inference: ≈0.01 min
  • Fusion adaptation: ≈0.3 min
  • Total: ≈1.8 min per query

Peak GPU memory usage (~10 GB) is driven by the 3D U-Nets. Integration is straightforward, requiring only a single support atlas per new segmentation context. No offline retraining or multi-shot fine-tuning is necessary, enabling direct deployment in clinical radiology and radiotherapy workflows.

6. Implications and Comparative Context

AtlasSegFM advances a paradigm where global anatomical context—instantiated via registration and transformed prompts—supplements the locality and prompt-dependence of current segmentation foundation models. This architecture particularly addresses shortcomings in contexts underrepresented in foundation model pretraining, as well as the limitations of precise prompting for challenging anatomies.

By decoupling test-time adaptation from backbone fine-tuning and reducing support requirements to a single annotated atlas, AtlasSegFM is positioned as a lightweight, flexible solution for real-world deployment, facilitating rapid customization for novel clinical tasks with minimal annotation cost (Zhang et al., 20 Dec 2025).
