EMedNeXt: Advanced Brain Tumor Segmentation
- EMedNeXt is an advanced brain tumor segmentation framework integrating MedNeXt V2 with deep supervision and ensembling to address low-field SSA MRI challenges.
- The method automates preprocessing, utilizes an expanded ROI, and applies hierarchical post-processing to overcome limited scan volume and annotation scarcity.
- EMedNeXt achieves a lesion-wise DSC of 0.897 and robust NSD scores, marking significant improvements over previous baselines in clinical-grade glioma segmentation.
EMedNeXt is an advanced brain tumor segmentation framework developed for robust operation on challenging sub-Saharan African (SSA) magnetic resonance imaging (MRI) datasets, which are characterized by low-field acquisition, limited scan volume, and marked heterogeneity. Designed for the BraTS-Lighthouse 2025 Challenge, EMedNeXt is anchored on the MedNeXt V2 backbone, incorporates nnU-Net v2–inspired architectural automation, leverages deep supervision for enhanced training signals, and employs optimized model ensembling and post-processing pipelines. The method addresses small dataset sizes, variable scanner quality, and annotation scarcity central to clinical settings in low-resource regions. On the hidden SSA validation set, EMedNeXt achieves a lesion-wise Dice similarity coefficient (DSC) of 0.897 and normalized surface Dice (NSD) scores of 0.541 (0.5 mm tolerance) and 0.84 (1.0 mm tolerance), demonstrating significant performance gains over previous baselines (Jaheen et al., 31 Jul 2025).
1. Network Architecture and Design Principles
EMedNeXt adopts a 3D U-Net–style encoder–decoder structure underpinned by ConvNeXt-inspired blocks and builds on the fully automated nnU-Net v2 configuration paradigm. The core architectural features include:
- Input Handling: Four MRI modalities (FLAIR, T1, T1ce, T2) and an optional "foreground" channel are harmonized into an enlarged region of interest (ROI), an explicit expansion over the prior 128³ input, facilitating contextual modeling of large or multifocal SSA tumors.
- Encoder Path: This consists of four stages, each implemented as a MedNeXt V2 block: depthwise convolution, pointwise (1×1×1) channel expansion, Instance Normalization (IN), GELU activation, and a further pointwise projection (a minimal block sketch follows this list). Down-sampling is realized via strided convolution or pooling.
- Bottleneck: The deepest layer uses the same MedNeXt V2 block without further resolution change.
- Decoder Path: Mirrored to the encoder, it incorporates up-sampling (via transposed convolution), skip-concatenation with encoder features, and MedNeXt V2 blocks.
- Deep Supervision: Three auxiliary segmentation heads generate intermediate outputs by tapping decoder features at progressively coarser levels, in addition to the final full-resolution prediction.
- Mathematical Operations: The architecture's key operations can be written as:
  - 3D convolution: $y(i,j,k) = \sum_{c}\sum_{u,v,w} K_c(u,v,w)\, x_c(i+u,\, j+v,\, k+w) + b$
  - Depthwise-separable block: $\mathrm{DWSep}(x) = \mathrm{PW}_{1\times1\times1}\!\left(\mathrm{DW}_{k\times k\times k}(x)\right)$, a per-channel (depthwise) spatial convolution followed by a pointwise channel-mixing convolution
  - Instance Normalization: $\hat{x} = \dfrac{x - \mu_{\mathrm{inst}}}{\sqrt{\sigma^{2}_{\mathrm{inst}} + \epsilon}}$, with statistics computed per channel and per sample over the spatial dimensions
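The following is a minimal PyTorch-style sketch of one such block (depthwise 3D convolution, pointwise expansion, Instance Normalization, GELU, pointwise projection), illustrating the composition described above; the kernel size, expansion ratio, and residual connection are illustrative assumptions rather than EMedNeXt's exact configuration.

```python
import torch
import torch.nn as nn

class MedNeXtStyleBlock(nn.Module):
    """Sketch of a MedNeXt-V2-style block: depthwise 3D conv -> pointwise
    (1x1x1) expansion -> InstanceNorm -> GELU -> pointwise projection.
    Kernel size, expansion ratio, and the residual connection are assumptions."""

    def __init__(self, channels: int, kernel_size: int = 3, expansion: int = 4):
        super().__init__()
        self.dwconv = nn.Conv3d(channels, channels, kernel_size,
                                padding=kernel_size // 2, groups=channels)
        self.expand = nn.Conv3d(channels, channels * expansion, kernel_size=1)
        self.norm = nn.InstanceNorm3d(channels * expansion)
        self.act = nn.GELU()
        self.project = nn.Conv3d(channels * expansion, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = x
        x = self.dwconv(x)           # depthwise spatial mixing
        x = self.expand(x)           # pointwise channel expansion
        x = self.act(self.norm(x))   # Instance Normalization + GELU
        x = self.project(x)          # pointwise projection back to `channels`
        return x + residual          # residual connection (assumed)
```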
The enlarged ROI specifically improves sensitivity to extensive and multifocal disease phenotypes prevalent in the SSA dataset.
2. Supervision and Training Objective
Training is governed by a composite deep supervision loss, fostering stable gradient flow even with annotation scarcity.
- Hybrid Loss: Each output (main and auxiliary) uses a Dice–Focal hybrid loss augmented with a boundary-aware penalty term:
  $\mathcal{L}_{\mathrm{seg}} = \mathcal{L}_{\mathrm{Dice}} + \mathcal{L}_{\mathrm{Focal}} + \lambda_{b}\, \mathcal{L}_{\mathrm{boundary}}$
- Deep Supervision Weighting: Auxiliary losses are weighted by powers of two ($w_i \propto 2^{-i}$, where $i$ indexes decoder depth from fine to coarse) to prioritize finer resolutions.
- Total Loss: $\mathcal{L}_{\mathrm{total}} = \sum_{i} w_i\, \mathcal{L}_{\mathrm{seg}}^{(i)}$
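A minimal sketch of this weighting scheme is shown below, assuming MONAI's DiceFocalLoss as the Dice–Focal component and a hypothetical `boundary_penalty` callable standing in for the boundary-aware term; the exact loss configuration in EMedNeXt may differ.

```python
import torch
import torch.nn.functional as F
from monai.losses import DiceFocalLoss

# Dice-Focal base loss; `softmax=True` / `to_onehot_y=True` are assumptions
# about how the heads and labels are formatted.
base_loss = DiceFocalLoss(softmax=True, to_onehot_y=True)

def deep_supervision_loss(outputs, target, boundary_penalty=None, lam=0.1):
    """Combine the full-resolution head and auxiliary heads.

    `outputs` is a list [full_res, aux_1, aux_2, aux_3] ordered fine-to-coarse;
    auxiliary targets are produced by nearest-neighbour downsampling of `target`
    (shape (B, 1, D, H, W)). The 2^{-i} weights follow the description above;
    `boundary_penalty` and `lam` are hypothetical placeholders.
    """
    weights = [2.0 ** (-i) for i in range(len(outputs))]  # 1, 1/2, 1/4, 1/8
    weights = [w / sum(weights) for w in weights]         # normalise to sum to 1

    total = 0.0
    for w, pred in zip(weights, outputs):
        tgt = F.interpolate(target.float(), size=pred.shape[2:], mode="nearest")
        loss = base_loss(pred, tgt)
        if boundary_penalty is not None:
            loss = loss + lam * boundary_penalty(pred, tgt)
        total = total + w * loss
    return total
```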
This approach enhances model trainability, particularly when decoder-focused fine-tuning is needed due to limited SSA labels.
3. Data Pipeline and Training Configuration
EMedNeXt's data pipeline is designed to cope with data scarcity and domain shift, combining aggressive preprocessing with calibrated training regimens.
Datasets: Pre-training uses 1,195 high-quality 4-modality PPTAG glioma scans; 60 SSA cases are used for training/fine-tuning, with 219 PPTAG and 35 SSA cases reserved for validation.
Preprocessing Sequence:
1. Denoising and negative-outlier clipping, resetting extreme values to zero.
2. Channel-wise z-score normalization over nonzero voxels.
3. Cubic resampling to isotropic 1.0 mm spacing, followed by cropping/padding to the network ROI.
4. Multi-channel stacking: [FLAIR, T1, T1ce, T2, aggregated_foreground_mask].
5. Output in parallelized NumPy formats.
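A minimal NumPy sketch of steps 2 and 4 is given below; denoising, outlier clipping, and the isotropic resampling of step 3 are assumed to be handled by an upstream utility, and the exact foreground definition used by EMedNeXt may differ.

```python
import numpy as np

def zscore_nonzero(volume: np.ndarray) -> np.ndarray:
    """Channel-wise z-score normalization computed over nonzero voxels only
    (step 2); background (zero) voxels are left untouched."""
    out = volume.astype(np.float32).copy()
    mask = out != 0
    if mask.any():
        mu, sigma = out[mask].mean(), out[mask].std()
        out[mask] = (out[mask] - mu) / max(float(sigma), 1e-8)
    return out

def preprocess_case(flair, t1, t1ce, t2):
    """Clip negative outliers to zero (step 1), normalize each modality
    (step 2), and stack with an aggregated foreground mask (step 4).
    Resampling/cropping (step 3) is assumed to happen upstream."""
    channels = []
    for vol in (flair, t1, t1ce, t2):
        vol = np.clip(vol, 0, None)          # reset negative outliers to zero
        channels.append(zscore_nonzero(vol))
    foreground = (np.stack(channels) != 0).any(axis=0).astype(np.float32)
    return np.stack(channels + [foreground])  # shape: (5, D, H, W)
```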
Augmentation: Minimal online augmentation is used; robust coverage arises from merging the PPTAG and SSA sets, with 7-way test-time augmentation (axis flips) applied at inference.
Training Regimes:
- Pre-training: schedule-free AdamW with separate learning rates for the encoder, decoder, and segmentation heads; 50 epochs, mixed precision, batch size 3.
- Fine-tuning: Only decoder and head layers are updated (optionally also the deepest encoder blocks); 50 epochs (see the sketch after this list).
- Full training: For the final leaderboard submission, no layers are frozen; AdamW optimizer, 150 epochs.
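The fine-tuning regime can be sketched as follows, assuming the model exposes `encoder`, `decoder`, and `heads` submodules (hypothetical attribute names) and using illustrative learning-rate and weight-decay values rather than the paper's exact settings.

```python
import torch

def configure_finetuning(model, lr_decoder=1e-4, lr_heads=1e-4, weight_decay=1e-5):
    """Freeze the encoder and build an AdamW optimizer over the decoder and
    segmentation heads only. Attribute names and hyperparameter values are
    illustrative assumptions."""
    for p in model.encoder.parameters():
        p.requires_grad = False          # encoder is frozen during fine-tuning

    param_groups = [
        {"params": model.decoder.parameters(), "lr": lr_decoder},
        {"params": model.heads.parameters(), "lr": lr_heads},
    ]
    return torch.optim.AdamW(param_groups, weight_decay=weight_decay)
```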
4. Model Ensembling and Post-Processing
EMedNeXt’s output is further refined by an ensemble and hierarchical post-processing scheme optimized for resource-limited inference.
Model Ensembling:
- Five top checkpoints from fine-tuned MedNeXt V2 models ("Base" configuration) are ensembled via uniform-weighted probability fusion (sketched after this list):
  $p_{\mathrm{ens}}(c \mid x) = \frac{1}{M} \sum_{m=1}^{M} p_m(c \mid x), \qquad M = 5$
- Inference is two-pass: per-model sliding-window prediction (50% overlap, 7-way flip test-time augmentation), followed by probability pooling and normalization across the ensemble.
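A minimal NumPy sketch of the uniform-weighted fusion step is shown below; it assumes each model has already produced a softmax probability map via sliding-window, flip-augmented inference.

```python
import numpy as np

def ensemble_probabilities(prob_maps):
    """Uniform-weighted probability fusion across model checkpoints.

    `prob_maps`: list of per-model softmax outputs, each of shape (C, D, H, W).
    Returns the fused probability map and its argmax label map.
    """
    fused = np.mean(np.stack(prob_maps, axis=0), axis=0)             # average over models
    fused /= np.clip(fused.sum(axis=0, keepdims=True), 1e-8, None)   # renormalize over classes
    return fused, fused.argmax(axis=0)
```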
Post-Processing Pipeline:
- Thresholding: Class-specific probability thresholds are applied to the fused probability maps to obtain binary masks.
- Connected Component Pruning: For each class, retain only connected components that satisfy minimum-size and minimum mean-probability constraints (parameter table and sketch below).
- Hierarchical Enforcement: Enforce the nested containment ET ⊆ TC ⊆ WT; propagate submasks upward and re-prune.
- Final Label Fusion: A fixed class priority order resolves overlaps, yielding a single three-class label map.
Parameter Values:
| Class | Min CC Size (voxels) | Min Mean Prob |
|---|---|---|
| TC | 150 | 0.1 |
| WT | 500 | 0.1 |
| ET | 100 | 0.1 |
If more than 10 CCs in a class survive, the 10 largest are kept.
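A sketch of the pruning and hierarchy steps is given below using scipy's connected-component labelling; the nested ET ⊆ TC ⊆ WT containment is the standard BraTS hierarchy and is assumed here, and the final priority-based label fusion is omitted.

```python
import numpy as np
from scipy import ndimage

# Per-class parameters from the table above.
MIN_CC_SIZE = {"WT": 500, "TC": 150, "ET": 100}
MIN_MEAN_PROB = {"WT": 0.1, "TC": 0.1, "ET": 0.1}
MAX_CCS = 10

def prune_class_mask(mask, prob_map, min_size, min_mean_prob, max_ccs=MAX_CCS):
    """Keep connected components that are large enough and have sufficient
    mean probability; if more than `max_ccs` survive, keep only the largest."""
    labeled, n = ndimage.label(mask)
    survivors = []
    for cc_id in range(1, n + 1):
        cc = labeled == cc_id
        if cc.sum() >= min_size and prob_map[cc].mean() >= min_mean_prob:
            survivors.append((int(cc.sum()), cc))
    survivors.sort(key=lambda s: s[0], reverse=True)
    out = np.zeros_like(mask, dtype=bool)
    for _, cc in survivors[:max_ccs]:
        out |= cc
    return out

def enforce_hierarchy(et, tc, wt):
    """Propagate submasks upward so that ET ⊆ TC ⊆ WT holds (assumed nesting);
    the caller then re-prunes each class."""
    tc = tc | et
    wt = wt | tc
    return et, tc, wt
```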
5. Performance Evaluation
EMedNeXt was benchmarked on the BraTS-Lighthouse 2025 hidden SSA validation set using the following metrics:
- Dice Similarity Coefficient (DSC):
  $\mathrm{DSC}(P, G) = \dfrac{2\,|P \cap G|}{|P| + |G|}$
  where $P$ and $G$ are the predicted and reference masks.
- Normalized Surface Dice (NSD), evaluated at tolerances $\tau = 0.5$ mm and $\tau = 1.0$ mm:
  $\mathrm{NSD}_\tau(P, G) = \dfrac{|\partial P \cap B_\tau(\partial G)| + |\partial G \cap B_\tau(\partial P)|}{|\partial P| + |\partial G|}$
  where $\partial$ denotes the mask surface and $B_\tau(\cdot)$ its band of tolerance $\tau$.
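For reference, a minimal voxel-wise DSC implementation is shown below (the BraTS evaluator additionally reports lesion-wise DSC, matching predicted and reference lesions component by component; surface-distance metrics such as NSD are available in libraries like MONAI).

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, ref: np.ndarray, eps: float = 1e-8) -> float:
    """Voxel-wise Dice similarity coefficient between two binary masks,
    matching the DSC definition above."""
    pred, ref = pred.astype(bool), ref.astype(bool)
    intersection = np.logical_and(pred, ref).sum()
    return float(2.0 * intersection / (pred.sum() + ref.sum() + eps))
```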
| Model | ET-DSC | TC-DSC | WT-DSC | ET-NSD (0.5 mm) | TC-NSD (0.5 mm) | WT-NSD (0.5 mm) | Mean NSD (1.0 mm) |
|---|---|---|---|---|---|---|---|
| Baseline MedNeXt V1 | 0.822 | 0.815 | 0.881 | 0.424 | 0.378 | 0.383 | 0.728 |
| MedNeXt V2 | 0.845 | 0.860 | 0.914 | 0.501 | 0.470 | 0.444 | 0.796 |
| FT MedNeXt V2 | 0.870 | 0.863 | 0.919 | 0.570 | 0.513 | 0.499 | 0.821 |
| Ensemble + PP | 0.883 | 0.873 | 0.933 | 0.580 | 0.522 | 0.521 | 0.839 |
Relative to baseline MedNeXt V1, the final ensemble raises the mean lesion-wise DSC from 0.839 to 0.896 and the mean NSD at 0.5 mm tolerance from 0.395 to 0.541 (0.728 to 0.839 at 1.0 mm). The improvements are consistent across tumor subregions (ET, TC, WT); while formal p-values are not reported, the repeated gains across all three classes suggest the improvement is systematic rather than class-specific (Jaheen et al., 31 Jul 2025).
6. Adaptations to SSA Data and Deployment Implications
EMedNeXt incorporates multiple design elements targeting SSA-specific resource and data limitations:
- Scanner Variability and Artifacts: Aggressive denoising, per-channel z-score normalization, and expansion to a larger ROI suppress inter-scanner signal variations. Merging high-quality PPTAG data broadens source distribution for the encoder, improving adaptability.
- Annotation Scarcity: A two-phase pre-train/fine-tune regimen, with encoder freezing, mitigates overfitting risks; deep supervision enhances decoder trainability on sparse SSA annotations.
- Computational Constraints: Sliding-window inference, memory-efficient ensembling, and mixed-precision training keep memory requirements tractable (batch size 3) even on midrange hardware (four NVIDIA A6000 GPUs).
- Deployment Strategies: Integration with MONAI enables standardized pre/inference pipelines; deployment exposes threshold/connected-component controls in graphical interfaces, permitting tuning for local noise. Single-model checkpoints can serve as fallbacks on limited compute, with ensemble benefits saturating after two to three models.
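For single-model fallback inference, MONAI's sliding-window utility can reproduce the windowed prediction described above; the ROI size below is illustrative rather than EMedNeXt's exact value.

```python
import torch
from monai.inferers import sliding_window_inference

def predict_case(model, image, roi_size=(128, 128, 128), overlap=0.5):
    """Single-model sliding-window inference with 50% window overlap.
    `image` has shape (1, C, D, H, W); `roi_size` is an assumed value."""
    model.eval()
    with torch.no_grad():
        logits = sliding_window_inference(
            inputs=image,
            roi_size=roi_size,
            sw_batch_size=1,
            predictor=model,
            overlap=overlap,
        )
    return torch.softmax(logits, dim=1)   # per-class probabilities
```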
A plausible implication is that such adaptations form a blueprint for deploying state-of-the-art segmentation in similarly constrained settings globally.
7. Summary and Future Considerations
EMedNeXt, by unifying an enlarged ROI, nnU-Net v2–guided MedNeXt V2 backbone, aggressive normalization, deep supervision, lightweight decoder adaptive training, and hierarchical post-processing, demonstrates readiness for deployment in low-resource SSA environments. It sets a new baseline for clinical-grade glioma quantification in MRI with an average lesion-wise DSC of 0.897 and robust NSD metrics. The modular design—supporting both ensemble and single-model inference—enables flexibility in clinical translation as hardware and data conditions evolve (Jaheen et al., 31 Jul 2025).