
SAGE-UNet: Adaptive Segmentation Architecture

Updated 30 November 2025
  • SAGE-UNet is an advanced, adaptive architecture that combines static CNN–Transformer backbones with dynamic, sparsely gated expert routing for precise medical segmentation.
  • It incorporates a Shape-Adapting Hub to harmonize heterogeneous modules, enabling local-global reasoning while reducing redundant computations.
  • The model achieves state-of-the-art Dice scores in colonoscopic and histopathological tasks, with dynamic gating yielding significant performance gains over conventional approaches.

SAGE-UNet is an input-adaptive, dynamically routed neural architecture for medical image segmentation, particularly targeting the challenges of cellular heterogeneity in whole slide images (WSIs) and colonoscopic lesion analysis. It operationalizes the Shape-Adapting Gated Experts (SAGE) framework, converting a static CNN–Transformer hybrid backbone (e.g., U-Net) into a sparsely gated mixture-of-experts model. SAGE-UNet features a dual-path design with hierarchical gating and a Shape-Adapting Hub (SA-Hub) for harmonizing architectural diversity between CNN and Transformer modules. Its adaptive computation paradigm reduces redundancy, enables local-global reasoning, and achieves state-of-the-art (SOTA) segmentation accuracy across multiple medical benchmarks (Thai et al., 23 Nov 2025).

1. SAGE-UNet Architecture and Dynamic Routing

SAGE-UNet generalizes traditional UNet architectures by introducing two parallel computational streams at every network layer:

  • Main Path: Preserves the operations of the original pretrained backbone, ensuring representational continuity.
  • Expert Path: Selectively activates a sparse set (Top-$K$) of experts, either shared or domain-specialized, using a multi-level gating mechanism.

Let $z_{i-1} \in \mathbb{R}^{H\times W\times C}$ denote the input feature map to the $i$-th layer. The main path computes $z_i^{(\mathrm{main})} = f_i(z_{i-1})$, while the expert path extracts a global embedding $\bar z_{i-1}$ and applies two gating stages:

  • Shared Expert Gate: Computes $g_s = \sigma(\bar z_{i-1} W_{\mathrm{gate}}^{(i)} + b_{\mathrm{gate}}^{(i)})$, where $\sigma$ is a logistic sigmoid, to distribute probability mass between shared and fine-grained experts.
  • Semantic Affinity Routing (SAR): For each candidate expert $j$, computes $L_{i,j}$ using query-key similarity and adaptive noise, then composes $L'_{i,j}$ by augmenting scores with $\log(g_s)$ (for shared experts) or $\log(1-g_s)$ (for specialized experts).

The model selects the Top-$K$ experts per layer, and their normalized gated outputs form the expert-path output:

$$z_i^{(\mathrm{expert})} = \sum_{j\in\mathcal{I}} w_j\,\hat z_i^{(j)}$$

where $\mathcal{I}$ denotes the indices of the $K$ most relevant experts.

A learnable scalar $\alpha_i$ fuses the main and expert paths:

$$z_i = \alpha_i\,z_i^{(\mathrm{main})} + (1-\alpha_i)\,z_i^{(\mathrm{expert})}$$

This design enables dynamic, input-dependent routing and adaptive capacity allocation.
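The dual-path computation above can be sketched in a few lines of NumPy. This is a schematic under simplifying assumptions, not the paper's implementation: routing logits come from a plain linear map of the global embedding rather than the paper's query-key similarity with adaptive noise, experts are arbitrary callables, and all parameters are passed in explicitly.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dual_path_layer(z_prev, f_main, experts, n_shared,
                    w_gate, b_gate, W_route, alpha, top_k=4):
    """One SAGE-style layer: main path fused with a sparsely gated expert path.

    z_prev  : (H, W, C) input feature map z_{i-1}
    f_main  : the frozen backbone operation f_i
    experts : list of M callables; the first n_shared are shared experts
    W_route : (C, M) routing weights (hypothetical stand-in for the
              paper's query-key affinity scores L_{i,j})
    alpha   : fusion scalar alpha_i (learnable in the real model)
    """
    z_main = f_main(z_prev)                       # main path: z_i^(main)

    z_bar = z_prev.mean(axis=(0, 1))              # global embedding
    g_s = sigmoid(z_bar @ w_gate + b_gate)        # shared-expert gate g_s

    scores = z_bar @ W_route                      # affinity scores L_{i,j}
    # Hierarchical logit modulation: shared experts gain log(g_s),
    # fine-grained experts gain log(1 - g_s), giving L'_{i,j}.
    bias = np.where(np.arange(len(experts)) < n_shared,
                    np.log(g_s + 1e-9), np.log(1.0 - g_s + 1e-9))
    scores = scores + bias

    # Top-K selection and normalized gate weights w_j.
    idx = np.argsort(scores)[-top_k:]
    w = np.exp(scores[idx] - scores[idx].max())
    w = w / w.sum()

    z_expert = sum(wj * experts[j](z_prev) for wj, j in zip(w, idx))

    # Dual-path fusion: z_i = alpha * z_main + (1 - alpha) * z_expert.
    return alpha * z_main + (1.0 - alpha) * z_expert

# Example usage with random features and simple scaling "experts".
rng = np.random.default_rng(0)
H, W, C, M = 4, 4, 8, 20
z = rng.standard_normal((H, W, C))
experts = [(lambda s: (lambda x: x * s))(s) for s in np.linspace(0.5, 1.5, M)]
out = dual_path_layer(z, lambda x: x, experts, n_shared=4,
                      w_gate=rng.standard_normal(C), b_gate=0.0,
                      W_route=rng.standard_normal((C, M)), alpha=0.6)
```

Only the selected Top-$K$ experts are evaluated, which is the source of the sparsity the paper exploits.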

2. Shape-Adapting Hub (SA-Hub) and Heterogeneous Expert Integration

The SA-Hub mediates architectural mismatch between the CNN and Transformer experts:

  • Input Adapter $S_{\mathrm{in}}$: Transforms a 2D CNN feature map into the target format (e.g., token sequence) required by the expert module.
  • Output Adapter $S_{\mathrm{out}}$: Projects the expert's output back into the CNN-style spatial and channel dimensions, enabling seamless path fusion.

These adapters allow shared use of heterogeneous expert backbones (e.g., ConvNeXt layers, ViT transformer blocks) within the same stage, achieving both semantic consistency and architectural interoperability.
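A minimal sketch of the adapter idea, assuming pure reshape-based conversion between spatial maps and token sequences; the actual $S_{\mathrm{in}}$/$S_{\mathrm{out}}$ likely also include learned projections to reconcile channel dimensions:

```python
import numpy as np

def s_in(feat_map):
    """Input adapter S_in: reshape a 2D CNN feature map (H, W, C) into a
    token sequence (H*W, C) for a Transformer-style expert."""
    h, w, c = feat_map.shape
    return feat_map.reshape(h * w, c), (h, w)

def s_out(tokens, spatial):
    """Output adapter S_out: reshape tokens back to CNN layout (H, W, C)."""
    h, w = spatial
    return tokens.reshape(h, w, tokens.shape[-1])

def wrap_transformer_expert(expert_fn):
    """Wrap a token-based expert so it consumes and produces CNN feature
    maps, making it interchangeable with CNN experts in the same stage."""
    def wrapped(feat_map):
        tokens, spatial = s_in(feat_map)
        return s_out(expert_fn(tokens), spatial)
    return wrapped

# Example: an identity stand-in for a Transformer block on tokens.
feat = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
expert = wrap_transformer_expert(lambda tokens: tokens)
out = expert(feat)
```

The wrapper is what lets a ViT block and a ConvNeXt layer sit behind the same gating interface.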

3. Architectural Specification and Implementation

SAGE-UNet is instantiated atop the TransUNet architecture augmented by SAGE mechanisms. Specific configuration parameters are:

  • Expert Pool: Total of $M=20$ experts per layer (4 shared, 16 fine-grained).
  • Expert Selection: Top-$K=4$ experts activated per layer.
  • Expert Injection: Experts are integrated at every encoder and decoder block, following the stage structure of ConvNeXt (4 stages) and ViT (16 transformer blocks).
  • Gating and Routing: Hierarchical logit modulation (Eqs. (5)–(7)), with expert selection affecting the final decoder logits.
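The specification above can be gathered into a single configuration object; the field names here are illustrative, not taken from the paper's code, but the values follow the stated configuration:

```python
# Hypothetical configuration for a SAGE-UNet instantiation; the values
# mirror the paper's reported setup, the key names are illustrative.
SAGE_UNET_CONFIG = {
    "num_experts": 20,       # M: total experts per layer
    "num_shared": 4,         # shared experts
    "num_fine_grained": 16,  # domain-specialized experts
    "top_k": 4,              # K: experts activated per layer
    "convnext_stages": 4,    # CNN encoder stages with expert injection
    "vit_blocks": 16,        # Transformer blocks with expert injection
}

# Sanity check: shared + fine-grained experts make up the full pool.
assert (SAGE_UNET_CONFIG["num_shared"]
        + SAGE_UNET_CONFIG["num_fine_grained"]
        == SAGE_UNET_CONFIG["num_experts"])
```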

A summary of the modular design:

| Component | Function | Associated Model Elements |
|---|---|---|
| Main Path | Preserves original backbone operations | $f_i$, $z_i^{(\mathrm{main})}$ |
| Expert Path | Dynamic, Top-$K$ expert selection | $g_s$, $L_{i,j}$, $w_j$, $z_i^{(\mathrm{expert})}$ |
| SA-Hub | Adapts between CNN and Transformer experts | $S_{\mathrm{in}}$, $S_{\mathrm{out}}$ |
| Dual-Path Fusion | Learns the balance $\alpha_i$ between paths | $z_i$ |

4. Training Procedure and Loss Functions

SAGE-UNet is trained end-to-end with a composite loss function per minibatch:

$$\mathcal{L}_{\mathrm{total}} = \lambda_{\mathrm{CE}}\,\mathcal{L}_{\mathrm{CE}}(\mathbf{P},\mathbf{Y}) + \lambda_{\mathrm{Dice}}\,\mathcal{L}_{\mathrm{Dice}}(\mathbf{P},\mathbf{Y}) + \lambda_{\mathrm{lb}} \sum_{i=1}^{T}\sum_{j=1}^{M} f_j^{(i)} P_j^{(i)}$$

where:

  • $\mathcal{L}_{\mathrm{CE}}$ is the cross-entropy loss,
  • $\mathcal{L}_{\mathrm{Dice}}$ is the Dice loss,
  • the third term is the load-balancing loss $\mathcal{L}_{\mathrm{lb}}$, encouraging even utilization of the expert pool across the $T$ layers and $M$ experts.

Hyperparameters are set as $\lambda_{\mathrm{CE}}=1$, $\lambda_{\mathrm{Dice}}=1.5$, $\lambda_{\mathrm{lb}}=1$. Optimization uses AdamW with a two-stage learning rate schedule.
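A minimal sketch of the composite objective, assuming binary segmentation and the standard soft-Dice formulation; the routing statistics $f_j^{(i)}$ and $P_j^{(i)}$ are supplied here as precomputed $T\times M$ arrays rather than collected from a live router:

```python
import numpy as np

def cross_entropy(p, y, eps=1e-9):
    """Pixel-wise binary cross-entropy between predictions P and labels Y."""
    return float(-np.mean(y * np.log(p + eps)
                          + (1 - y) * np.log(1 - p + eps)))

def dice_loss(p, y, eps=1e-6):
    """Soft Dice loss: 1 - 2|P∩Y| / (|P| + |Y|)."""
    inter = (p * y).sum()
    return float(1.0 - (2.0 * inter + eps) / (p.sum() + y.sum() + eps))

def load_balance(f, P):
    """Load-balancing term sum_i sum_j f_j^(i) P_j^(i): f holds per-layer
    expert routing fractions, P the mean routing probabilities."""
    return float((f * P).sum())

def total_loss(p, y, f, P, lam_ce=1.0, lam_dice=1.5, lam_lb=1.0):
    """Composite objective with the paper's reported lambda weights."""
    return (lam_ce * cross_entropy(p, y)
            + lam_dice * dice_loss(p, y)
            + lam_lb * load_balance(f, P))

# Example: near-perfect prediction with uniform routing over T=3, M=20.
y = np.array([[0.0, 1.0], [1.0, 0.0]])
p = np.clip(y, 0.01, 0.99)
f = np.full((3, 20), 1 / 20)
loss = total_loss(p, y, f, f)
```

Note that uniform routing minimizes the load-balancing term for a fixed total dispatch, which is exactly the behavior the regularizer rewards.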

5. Quantitative Results and Ablation Studies

SAGE-UNet achieves new SOTA Dice scores across multiple colorectal histopathology datasets:

| Dataset (subset) | Dice Score (%) | Dice Gain over Baseline (%) |
|---|---|---|
| EBHI (Adenocarcinoma) | 95.57 | +3.3 |
| DigestPath (colon patch) | 95.16 | +1.7 |
| GlaS (A+B) | 94.17 | +2.65 (approx.) |

These results surpass ConvNeXt-UNet, SegFormer, and EViT-UNet benchmarks. Ablations demonstrate:

  • Sigmoid (vs. softmax) gating increases EBHI Dice from 95.05% to 95.57%.
  • Increasing Top-$K$ from 1 to 4 yields a +5.4% Dice improvement.
  • Scaling shared experts from 1 to 4 adds +0.47% Dice.

6. Domain Generalization Performance

On GlaS Test B (designed to assess domain shift), SAGE-UNet achieves 94.67% Dice, surpassing EViT-UNet by +1.4% and UNet++ by +2.74%. Qualitative evaluations indicate robust boundary delineation under morphological shifts. Shared gating scalars $g_s$ suggest CNN stages favor shared experts ($g_s \gg 0.5$) while Transformer stages operate near $g_s \approx 0.5$, evidencing context-dependent dispatch that may underlie improved adaptability (Thai et al., 23 Nov 2025).

7. Significance, Limitations, and Future Directions

SAGE-UNet demonstrates several empirical and architectural advantages:

  • Adaptive Computation: Reduces redundant expert evaluation for simple regions, concentrating capacity on complex instances.
  • Hierarchical Expert Routing: Enables flexible balance between general and specialized processing, with improved interpretability and segmentation accuracy.
  • Heterogeneous Module Fusion: SA-Hub allows seamless collaboration between architectures tailored for distinct representational granularities.

Identified limitations include increased implementation complexity and slower per-layer inference due to multi-expert evaluation, despite overall sparsity. The optimal design of the expert pool (depth and width) remains open; automated architecture search represents a plausible avenue for improving the performance-efficiency trade-off. Extension to 3D or multi-modal medical data is identified as a promising future direction (Thai et al., 23 Nov 2025).

SAGE-UNet establishes dynamic expert routing with shape-adapting fusion as a scalable and accurate paradigm for complex medical segmentation challenges.
