BraTS-Africa Dataset

Updated 7 October 2025

BraTS-Africa is a curated multi-parametric MRI dataset from Sub-Saharan Africa featuring expert glioma segmentation labels to address regional diagnostic disparities.
The dataset includes four MRI modalities and detailed annotations for enhancing tumor, non-enhancing tumor core, and surrounding hyperintensity, highlighting imaging heterogeneity and clinical challenges.
Advanced machine learning strategies, including ensemble, transformer-based, and parameter-efficient fine-tuning methods, are benchmarked on BraTS-Africa to enhance segmentation performance in resource-limited settings.

The BraTS-Africa dataset is a curated multi-institutional collection of pre-operative multi-parametric MRI (mpMRI) scans and expert-annotated glioma segmentations, acquired from clinical centers across Sub-Saharan Africa (SSA). Developed to address the pronounced healthcare disparities in brain tumor diagnosis and treatment between high-resource and low-resource regions, BraTS-Africa provides a critical substrate for evaluating and developing robust, generalizable computer-aided diagnostic (CAD) and segmentation methodologies tailored for resource-limited environments. It is integral to the MICCAI Brain Tumor Segmentation (BraTS) challenge series, explicitly enabling the benchmarking of modern machine learning (ML) and deep learning (DL) models on African MRI data characterized by lower resolution, scanner heterogeneity, and unique disease phenotypes.

1. Dataset Composition and Imaging Characteristics

BraTS-Africa consists of a retrospective cohort of adult pre-operative glioma cases, sourced from multiple SSA imaging centers. The data is partitioned into standard training, validation, and testing cohorts. Each case comprises four mpMRI modalities: T1-weighted, post-contrast T1-weighted (T1Gd), T2-weighted, and T2-FLAIR. Expert radiologists supply voxel-wise segmentation labels for three tumor subregions:

Enhancing Tumor (ET)
Non-Enhancing Tumor Core (NETC)
Surrounding Non-Enhancing FLAIR Hyperintensity (SNFH)
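
In BraTS-style evaluation these three labels are usually scored as nested composite regions: whole tumor (WT), tumor core (TC), and enhancing tumor (ET), which is how the WT/TC/ET abbreviations in the reported results should be read. The sketch below shows that mapping under an assumed integer label convention (1 = NETC, 2 = SNFH, 3 = ET, as in BraTS 2023); the label values are not stated in this article, and the helper name `composite_regions` is hypothetical.

```python
# Assumed label convention (BraTS 2023 style); not stated in the article text.
NETC, SNFH, ET = 1, 2, 3

REGIONS = {
    "WT": {NETC, SNFH, ET},  # whole tumor: all three subregions
    "TC": {NETC, ET},        # tumor core: enhancing + non-enhancing core
    "ET": {ET},              # enhancing tumor only
}


def composite_regions(labels):
    """Map a flat sequence of voxel labels to binary masks for WT, TC, and ET."""
    return {
        name: [1 if v in members else 0 for v in labels]
        for name, members in REGIONS.items()
    }


# Toy example: six voxels, 0 = background.
voxels = [0, 1, 2, 3, 0, 2]
masks = composite_regions(voxels)
```

Because the regions are nested, an error in the ET label also affects the TC and WT scores, which is worth keeping in mind when comparing per-region Dice values.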

A salient attribute of BraTS-Africa is its inherent imaging heterogeneity: images are acquired on hospital equipment with variable protocols and frequently exhibit reduced contrast and resolution and a heavier artifact burden relative to Global North datasets. The patient population is further distinguished by frequent late-stage presentation and higher rates of radiologically suspected gliomatosis cerebri (Adewole et al., 2023). Preprocessing pipelines standardize the data via NIfTI conversion, co-registration to the SRI24 template, resampling to isotropic 1 mm³ voxels, and skull stripping.
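
The resampling step can be illustrated in isolation. The following is a minimal sketch, not the dataset's actual pipeline: it uses nearest-neighbor lookup on a nested-list volume, whereas real pipelines typically rely on an imaging library and smoother interpolation for intensity images. The function names and the 5 mm slice thickness in the example are invented for illustration.

```python
def target_shape(shape, spacing_mm, new_spacing_mm=1.0):
    """Grid size that covers the same physical extent at the new spacing."""
    return tuple(
        max(1, round(n * s / new_spacing_mm)) for n, s in zip(shape, spacing_mm)
    )


def resample_nearest(volume, spacing_mm, new_spacing_mm=1.0):
    """Nearest-neighbor resampling of a nested-list volume indexed [z][y][x]."""
    shape = (len(volume), len(volume[0]), len(volume[0][0]))
    new_shape = target_shape(shape, spacing_mm, new_spacing_mm)

    def source_indices(axis):
        # Map each new voxel center back to the nearest source voxel index.
        scale = new_spacing_mm / spacing_mm[axis]
        return [
            min(shape[axis] - 1, max(0, round((i + 0.5) * scale - 0.5)))
            for i in range(new_shape[axis])
        ]

    zs, ys, xs = (source_indices(axis) for axis in range(3))
    return [[[volume[z][y][x] for x in xs] for y in ys] for z in zs]


# Hypothetical thick-slice scan: 2 slices of 5 mm, 1 mm in-plane spacing.
volume = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
resampled = resample_nearest(volume, spacing_mm=(5.0, 1.0, 1.0))
```

Nearest-neighbor lookup is the appropriate choice for label maps, since interpolating between integer labels would create values that do not correspond to any tumor subregion.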

2. Challenges Specific to Sub-Saharan African Settings

The SSA context introduces distinct obstacles for the development of segmentation algorithms:

Systematic late presentation of glioma cases limits the diversity of early lesions.
Lower-quality MRI technology, prevalent in SSA, produces scans with decreased spatial fidelity and signal-to-noise ratio.
Heterogeneous demographics and suspected atypical tumor phenotypes further complicate model generalization.

These factors exacerbate the difficulty of tumor subregion visualization, segmentation, and clinical interpretability compared to contemporary datasets (Adewole et al., 2023, Amod et al., 2023, Adhikari et al., 2024). Such limitations necessitate bespoke ML methods that can robustly extract actionable features from suboptimal imaging and maintain high performance even under distribution shifts.

3. Machine Learning Methodologies Benchmarking BraTS-Africa

BraTS-Africa has catalyzed the development and evaluation of state-of-the-art segmentation models, including both convolutional and transformer-inspired networks. Key approaches documented include:

Ensemble models combining CNNs such as DeepSeg, nnU-Net, and DeepSCAN, demonstrating notable generalization to SSA data (mean Dice: 0.9737 for whole tumor (WT), 0.9593 for tumor core (TC), 0.9022 for ET; HD95: 2.66 mm, 1.72 mm, 3.32 mm, respectively) (Zeineldin et al., 2022).
Ensemble frameworks using UNet3D, V-Net, and MSA-VNet, aggregated via the STAPLE algorithm, reaching Dice scores of 0.8521 (WT), 0.8358 (TC), and 0.8167 (ET) (Fadugba et al., 4 Feb 2025).
Transformer-inspired and transformer-based models (MedNeXt, SwinUNETR-v2), pretrained on the larger BraTS-GLI datasets and fine-tuned on SSA samples (Adhikari et al., 2024, Jaheen et al., 31 Jul 2025, Musah et al., 29 Jul 2025).

Parameter-efficient fine-tuning (PEFT), combined with adaptive postprocessing, has achieved performance competitive with full fine-tuning (mean Dice: 0.80 with PEFT versus 0.77 with full fine-tuning), which is an advantage in computation-restricted settings (Adhikari et al., 2024).

Fine-tuning strategies consistently outperform both local-only training and direct data-mixing, reflecting the necessity of domain adaptation for practical success in SSA imaging (Amod et al., 2023, Maani et al., 2024, Musah et al., 29 Jul 2025).

4. Evaluation Metrics and Methodological Best Practices

Performance on BraTS-Africa is typically quantified via:

Dice Similarity Coefficient (DSC), $\text{DSC} = \frac{2 |A \cap B|}{|A| + |B|}$, where $A$ and $B$ are the sets of ground-truth and predicted voxels (Adewole et al., 2023).
Hausdorff Distance (95th percentile, HD95). The classical Hausdorff distance is $HD(A,B) = \max\Big\{ \sup_{a \in A}\inf_{b \in B} d(a,b), \sup_{b \in B}\inf_{a \in A} d(a,b)\Big\}$; HD95 replaces each supremum with the 95th percentile of the nearest-neighbor distances to mitigate outlier effects (Adewole et al., 2023).
Lesion-wise and boundary metrics: lesion-wise Dice (LSD) and Normalized Surface Dice (NSD) at multiple tolerances (e.g., 0.5 mm, 1.0 mm) are used to assess both volumetric and boundary fidelity (Jaheen et al., 31 Jul 2025, Ankomah et al., 3 Oct 2025).
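
The first two metrics can be sketched directly from their definitions by treating each mask as a set of voxel coordinates. This is an illustrative sketch under stated assumptions rather than the challenge's evaluation code: HD95 is taken here as the larger of the two directed 95th-percentile nearest-neighbor distances, distances are computed over all mask voxels rather than surface voxels only, coordinates are assumed to be in millimeters, and two empty masks are scored as a Dice of 1.0. Evaluation toolkits differ on each of these choices.

```python
import math


def dice(a, b):
    """Dice similarity coefficient for two sets of voxel coordinates."""
    if not a and not b:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    xs = sorted(values)
    k = (len(xs) - 1) * q / 100
    lo, hi = math.floor(k), math.ceil(k)
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)


def directed_distances(src, dst):
    """Distance from every point in src to its nearest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def hd95(a, b):
    """Larger of the two directed 95th-percentile distances (non-empty sets)."""
    return max(
        percentile(directed_distances(a, b), 95),
        percentile(directed_distances(b, a), 95),
    )


# Toy example: the prediction is the ground truth shifted by one voxel.
truth = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
pred = {(0, 0, 1), (0, 1, 1), (0, 0, 2), (0, 1, 2)}
dsc = dice(truth, pred)
boundary_error = hd95(truth, pred)
```

The brute-force nearest-neighbor search is quadratic in the number of voxels, so it is only suitable for small examples; practical implementations use distance transforms or spatial indexing.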

Cross-validation with stratified fold creation (e.g., folds stratified by radiomic feature-based clustering (Parida et al., 2024)) gives more reliable performance estimates and helps detect overfitting, which is especially important given the limited number of SSA cases.
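
One way to picture stratified fold creation is shown below. The sketch assumes that each case has already been assigned a cluster label (for instance from clustering radiomic features, a step not shown here) and then deals the cases of each cluster across folds in round-robin order. The case identifiers, cluster labels, and the function `stratified_folds` are hypothetical and do not reproduce the procedure of the cited work.

```python
import random
from collections import defaultdict


def stratified_folds(case_to_cluster, n_folds=5, seed=0):
    """Assign cases to folds so that every cluster is spread evenly across folds."""
    by_cluster = defaultdict(list)
    for case, cluster in sorted(case_to_cluster.items()):
        by_cluster[cluster].append(case)

    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    offset = 0
    for cluster in sorted(by_cluster):
        cases = by_cluster[cluster]
        rng.shuffle(cases)  # randomize order within the cluster
        for i, case in enumerate(cases):
            folds[(offset + i) % n_folds].append(case)
        offset += len(cases)  # continue dealing where the last cluster stopped
    return folds


# Hypothetical cohort: 12 cases alternating between two clusters.
clusters = {f"case_{i:02d}": i % 2 for i in range(12)}
folds = stratified_folds(clusters, n_folds=3)
```

Carrying the offset from one cluster to the next keeps fold sizes balanced even when cluster sizes are not multiples of the number of folds.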

5. Impact on Global Neuro-Oncology Research and Clinical Care

The public availability of BraTS-Africa has enabled the field to rigorously interrogate how segmentation algorithms transfer to domains marked by scan heterogeneity, annotation scarcity, and demographic diversity. Its integration with challenges such as BraTS-Lighthouse, BraTS-GoAT, and campaigns funded by the Lacuna Fund has promoted the development of adaptive, resource-efficient AI solutions.

High-performing strategies (e.g., transfer learning from large-scale Global North datasets followed by fine-tuning and adaptive augmentation (Maani et al., 2024, Zhao et al., 2024, Chepchirchir et al., 7 Jan 2025)) show promise for improving diagnostic accuracy and treatment planning in SSA, potentially narrowing healthcare outcome disparities. Ensemble and boundary-aware pipelines, as exemplified by EMedNeXt, capitalize on contextual information and robust architectural skeletons to achieve lesion-wise DSC of ≈0.897 and NSD up to 0.84 (Jaheen et al., 31 Jul 2025).

A plausible implication is that future extensions (such as federated learning for privacy-aware collaboration across SSA centers, more aggressive artifact-simulation augmentation, and local radiomic stratification) may further boost model reliability and foster sustainable clinical deployment in low-resource regions.

6. Data Augmentation, Fine-Tuning, and Ensemble Strategies

Given the limited sample size and domain shift in BraTS-Africa, segmentation-aware data augmentation (segmentation-mask guided elastic deformation, anatomy-preserving transforms (Ankomah et al., 3 Oct 2025)) and advanced ensembling (MedNeXt, SegMamba, Residual-Encoder U-Net (Ankomah et al., 3 Oct 2025)) drive improvements in model generalization. Neural style transfer augmentation, where low-quality SSA MRIs are in-painted with stylistic cues from high-quality images (Chepchirchir et al., 7 Jan 2025), has shown efficacy in reducing empty-mask artifacts on SSA validation data.

Ensembling strategies, especially STAPLE and weighted model combination, are demonstrated to yield more balanced segmentations across tumor subregions, smoothing inter-model prediction variability (Fadugba et al., 4 Feb 2025, Parida et al., 2024).
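
To make the STAPLE idea concrete, the sketch below implements a simplified binary version: an expectation-maximization loop that alternates between estimating the probability that each voxel is truly foreground and re-estimating each model's sensitivity and specificity. It is an illustration under assumptions (one tumor region at a time, a global foreground prior taken from the mean vote, a fixed iteration count, and an initial value of 0.9 for every model), not the implementation used in the cited studies.

```python
def staple_binary(decisions, n_iter=50, init=0.9):
    """Simplified binary STAPLE over decisions[voxel][model] in {0, 1}.

    Returns (posterior foreground probability per voxel,
             estimated sensitivity per model,
             estimated specificity per model).
    """
    n_models = len(decisions[0])
    prior = sum(map(sum, decisions)) / (len(decisions) * n_models)
    sens = [init] * n_models
    spec = [init] * n_models
    eps = 1e-12  # guards against division by zero
    post = []
    for _ in range(n_iter):
        # E-step: posterior probability that each voxel is truly foreground.
        post = []
        for row in decisions:
            fg, bg = prior, 1.0 - prior
            for d, p, q in zip(row, sens, spec):
                fg *= p if d else 1.0 - p
                bg *= 1.0 - q if d else q
            post.append(fg / (fg + bg + eps))
        # M-step: re-estimate each model's sensitivity and specificity.
        total_fg = sum(post) + eps
        total_bg = sum(1.0 - w for w in post) + eps
        sens = [
            sum(w * row[j] for w, row in zip(post, decisions)) / total_fg
            for j in range(n_models)
        ]
        spec = [
            sum((1.0 - w) * (1 - row[j]) for w, row in zip(post, decisions)) / total_bg
            for j in range(n_models)
        ]
    return post, sens, spec


# Toy example: three models, six voxels; the third model disagrees twice.
votes = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [0, 0, 1], [1, 1, 1], [0, 0, 0]]
posterior, sensitivity, specificity = staple_binary(votes)
consensus = [1 if w >= 0.5 else 0 for w in posterior]
```

Unlike plain majority voting, the estimated sensitivities and specificities act as learned per-model weights, so a model that frequently disagrees with the emerging consensus contributes less to the final mask.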

7. Limitations, Controversies, and Directions for Expansion

Analyses indicate that naive mixing of disparate datasets (SSA + Global North) without targeted adaptation may reduce performance on SSA validation cohorts, due to annotation style divergence and protocol heterogeneity (Musah et al., 29 Jul 2025, Amod et al., 2023). Novel models such as the state-space (Mamba-based) SegMamba and adaptation mechanisms (e.g., convolutional adapters for PEFT (Adhikari et al., 2024)) offer promising computational trade-offs, but may still show sensitivity/specificity imbalances (e.g., high specificity but lower sensitivity in small regions).

The explicit documentation and open-source dissemination of code and pipelines (e.g., via MedPerf, nnU-Net, EMedNeXt repositories) reinforce reproducibility and provide templates for further research.

Ongoing recommendations include expanded acquisition protocols, more granular annotation standards adapted for SSA practice, and systematic investigations into local domain adaptation methods. These efforts are poised to strengthen equitable clinical translation and foster methodological rigor in future segmentation studies focused on underrepresented populations.