
MU-Glioma-Post Dataset for Post-Treatment MRI Segmentation

Updated 15 December 2025
  • The MU-Glioma-Post dataset is a multi-institutional MRI repository of 2,200 post-treatment glioma cases that serves as a benchmark for segmentation algorithms.
  • It employs a standardized preprocessing pipeline including DICOM-to-NIfTI conversion, skull-stripping, affine registration, and intensity normalization to ensure cross-site comparability.
  • Expert consensus annotations delineate four tumor sub-regions; segmentation performance is assessed with the Dice Similarity Coefficient and the Hausdorff Distance.

The MU-Glioma-Post dataset is the foundational resource underlying the 2024 Brain Tumor Segmentation (BraTS) Challenge focused on automated segmentation of post-treatment glioma in magnetic resonance imaging (MRI). This dataset comprises 2,200 clinically acquired, multiparametric MRI examinations from seven major academic centers, representing the largest publicly accessible, expert-annotated cohort of post-treatment adult diffuse glioma cases to date. It serves both as a benchmarking platform for segmentation algorithms and as a research-grade dataset supporting investigation of tumor sub-region delineation after surgical and adjuvant therapies (Verdier et al., 28 May 2024).

1. Data Acquisition and Cohort Composition

The MU-Glioma-Post dataset aggregates MRI data from Duke University (680 cases), UCSF (600), University of Missouri Columbia (400), UCSD (350), Heidelberg University Hospital (300), University of Michigan (100), and Indiana University (70). All subjects are adults diagnosed with WHO grade II–IV diffuse glioma, spanning the full range from low- to high-grade entities, restricted to the post-treatment clinical context (defined as post-surgical resection, radiation, and/or systemic therapy). Imaging originates from both 1.5 T and 3 T MRI scanners from Siemens, GE, and Philips. The dataset encompasses a wide heterogeneity of post-treatment appearances, including varying extents of resection, infiltrative recurrence, and treatment-related changes.

Each case includes the following mpMRI modalities:

  • Pre-contrast T1-weighted (T1)
  • Contrast-enhanced T1-weighted (T1-Gd)
  • T2-weighted (T2)
  • T2-FLAIR
  • T1-Gd minus T1 subtraction images (provided for annotation assistance)

Typical in-plane resolution is 0.5–1.0 mm × 0.5–1.0 mm, with slice thicknesses of 1–5 mm. Precise acquisition parameter distributions are site-dependent and not comprehensively enumerated (Verdier et al., 28 May 2024).

2. Preprocessing Pipeline and Data Standardization

A standardized preprocessing workflow is implemented to ensure cross-site comparability:

  1. DICOM-to-NIfTI conversion using dcm2niix.
  2. Automated skull-stripping employing HD-BET.
  3. Affine registration of all four sequences to the MNI symmetric atlas using CapTK/Greedy.
  4. Intensity standardization, performed either in accordance with prior BraTS pipelines or through site-specific normalization (including FeTS 2.0 normalization at certain centers).

This unified approach to spatial alignment and intensity normalization permits robust downstream model development and evaluation, despite the high degree of protocol heterogeneity inherent to multi-institutional imaging data.
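
As a minimal sketch of the intensity standardization step, the snippet below applies a z-score normalization restricted to brain voxels. This is an illustrative assumption: the source does not give the normalization formula, and it is not the FeTS 2.0 normalization. The function name `zscore_normalize` and the zero-valued background convention are hypothetical, and a flat Python list stands in for an image volume.

```python
from statistics import mean, pstdev


def zscore_normalize(voxels, background=0.0):
    """Z-score the non-background voxels of a flattened, skull-stripped image.

    Assumption: after skull-stripping, every non-brain voxel equals `background`.
    Background voxels are returned unchanged.
    """
    brain = [v for v in voxels if v != background]
    if not brain:
        return list(voxels)  # nothing to normalize
    mu = mean(brain)
    sigma = pstdev(brain)  # population standard deviation of brain voxels
    if sigma == 0:
        # Constant brain intensity: map every brain voxel to zero.
        return [background if v == background else 0.0 for v in voxels]
    return [background if v == background else (v - mu) / sigma for v in voxels]
```

Restricting the statistics to brain voxels keeps the large zero-valued background left by skull-stripping from dominating the mean and standard deviation.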

3. Annotation Protocol and Segmentation Taxonomy

Annotations follow a rigorously defined protocol established by expert neuroradiologists and radiation oncologists. The workflow incorporates:

  • Initial pre-segmentations by five independent nnU-Net or SegResNet models.
  • STAPLE fusion to integrate pre-segmentations.
  • Iterative manual correction and consensus refinement by board-certified neuroradiologist approvers.
  • Two independent annotation-approver cycles for the hidden test set, enabling extraction of inter-rater reliability estimates (not reported in the dataset description).
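
To illustrate the idea of fusing several model pre-segmentations into one label map, the sketch below uses a per-voxel majority vote. This is a deliberately simpler stand-in, not STAPLE itself: STAPLE additionally estimates each rater's reliability through expectation-maximization and weights the votes accordingly. The function name `majority_vote` and the tie-breaking rule are hypothetical.

```python
from collections import Counter


def majority_vote(label_maps):
    """Fuse several flattened label maps by per-voxel majority vote.

    Ties are broken in favour of the smallest label value, which is an
    arbitrary choice made only for this illustration.
    """
    if not label_maps:
        raise ValueError("need at least one label map")
    n = len(label_maps[0])
    if any(len(m) != n for m in label_maps):
        raise ValueError("all label maps must have the same length")
    fused = []
    for votes in zip(*label_maps):  # one tuple of votes per voxel
        counts = Counter(votes)
        top = max(counts.values())
        fused.append(min(label for label, c in counts.items() if c == top))
    return fused
```

With five pre-segmentations, a voxel receives the label that at least a plurality of models assigned to it; the fused map then serves as the starting point for manual correction.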

Segmentation targets four mutually exclusive, clinically relevant post-treatment sub-regions:

  • Enhancing Tissue (ET): Nodular or thick contrast enhancement on T1-Gd (excluding vessels or peri-cavity thin enhancement), with boundaries refined using T1-Gd–T1 subtraction.
  • Non-Enhancing Tumor Core (NETC): Non-enhancing necrotic/cystic regions appearing hypointense on T1/T1-Gd, excluding resection cavity.
  • Surrounding Non-Enhancing FLAIR Hyperintensity (SNFH): All T2/FLAIR hyperintensity post-treatment potentially representing edema, gliosis, or infiltrating tumor, not including chronic microvascular changes.
  • Resection Cavity (RC): Acute or chronic cavities (CSF-isointense on all modalities or containing blood/air/protein) corresponding to prior surgical resection.

All segmentation masks are provided as NIfTI .nii.gz volumes with integer-coded labels: 1 = ET, 2 = SNFH, 3 = NETC, 4 = RC.
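
The integer coding can be turned into per-sub-region volumes with a simple voxel count. The sketch below uses the label coding exactly as stated above; the function name `subregion_volumes_ml` is hypothetical, the default voxel volume of 1 mm³ is an assumption (the source does not state the voxel spacing of the released volumes), and a flat list of integers stands in for a loaded mask.

```python
from collections import Counter

# Label coding as stated in the text above; 0 is assumed to be background.
LABELS = {1: "ET", 2: "SNFH", 3: "NETC", 4: "RC"}


def subregion_volumes_ml(mask_voxels, voxel_volume_mm3=1.0):
    """Return the volume in millilitres of each sub-region in a flattened mask."""
    counts = Counter(mask_voxels)
    return {
        name: counts.get(code, 0) * voxel_volume_mm3 / 1000.0
        for code, name in LABELS.items()
    }
```

Because the four labels are mutually exclusive, the per-region volumes can be summed directly to obtain the total annotated volume.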

4. Dataset Organization, Splitting, and Access

The data are organized following standard BraTS directory conventions. Each subject's imaging volumes are stored as:

| Data Structure | Modality/Label |
|---|---|
| /Training/Image_{CaseID}_{modality}.nii.gz | T1, T1Gd, T2, FLAIR |
| /Training/Label_{CaseID}.nii.gz | Segmentation mask (1–4) |
| /Validation/images/… | Unlabeled for validation |
| /Testing/images/… | Unlabeled for testing |
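
The training file pattern above can be expanded programmatically. The sketch below builds the expected paths for one case; the function name `training_paths`, the dataset root, and the example case ID are hypothetical, and using the tokens T1, T1Gd, T2, and FLAIR verbatim as the `{modality}` part of the file name is an assumption.

```python
from pathlib import PurePosixPath

# Assumed file-name tokens for the four sequences.
MODALITIES = ("T1", "T1Gd", "T2", "FLAIR")


def training_paths(root, case_id):
    """Return ({modality: image path}, label path) for one training case."""
    base = PurePosixPath(root) / "Training"
    images = {m: base / f"Image_{case_id}_{m}.nii.gz" for m in MODALITIES}
    label = base / f"Label_{case_id}.nii.gz"
    return images, label


# Example with a hypothetical root directory and case ID.
images, label = training_paths("/data/MU-Glioma-Post", "0001")
```

Centralizing the pattern in one function keeps loaders consistent if the actual naming on disk turns out to differ from this assumption.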

Dataset partitions:

  • Training: ~70% (≈1,540 cases), with full ground-truth
  • Validation: ~10% (≈220 cases), no ground-truth released
  • Testing: ~20% (≈440 cases), fully blinded
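
The approximate case counts follow directly from the stated percentages and the total of 2,200 cases, as this short check shows. The names `TOTAL_CASES`, `SPLIT_PERCENT`, and `split_sizes` are illustrative, and the exact released counts may differ slightly from these rounded figures.

```python
TOTAL_CASES = 2200
SPLIT_PERCENT = {"Training": 70, "Validation": 10, "Testing": 20}


def split_sizes(total, percents):
    """Integer case counts per partition for whole-number percentages."""
    return {name: total * pct // 100 for name, pct in percents.items()}


sizes = split_sizes(TOTAL_CASES, SPLIT_PERCENT)
```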

Ground-truth annotations are distributed exclusively for training; validation and test label volumes are withheld to preserve the integrity of benchmark evaluations and challenge rankings. File nomenclature rigorously encodes CaseID and modality.

The dataset is publicly available through the Sage Bionetworks Synapse platform (https://www.synapse.org/#!Synapse:syn53708249/wiki/627500), subject to free registration and a standard data use agreement. The license permits academic and non-commercial research with attribution.

5. Evaluation Metrics and Statistical Notes

Primary evaluation metrics for automated segmentation are the lesion-wise Dice Similarity Coefficient (DSC) and the 95th percentile Hausdorff Distance ($\mathrm{HD}_{95}$), applied to each tumor sub-region:

  • $\mathrm{DSC}(P,G) = \frac{2\,\lvert P \cap G \rvert}{\lvert P\rvert + \lvert G\rvert}$
  • $\mathrm{HD}_{95}(P,G) = \max\Bigl\{\mathrm{Percentile}_{95}\bigl(\{\min_{y\in G} d(x,y)\}_{x\in P}\bigr),\ \mathrm{Percentile}_{95}\bigl(\{\min_{x\in P} d(y,x)\}_{y\in G}\bigr)\Bigr\}$
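
The two formulas can be sketched directly on sets of voxel coordinates, with P as the predicted set and G as the ground-truth set. Several choices below are assumptions not fixed by the source: d is taken to be the Euclidean distance, the percentile uses linear interpolation between order statistics, two empty masks score a DSC of 1.0, and distances are computed over all voxels by brute force, which is only practical for small examples.

```python
import math


def dice(pred, truth):
    """DSC between two sets of voxel coordinates."""
    if not pred and not truth:
        return 1.0  # assumed convention when both masks are empty
    return 2 * len(pred & truth) / (len(pred) + len(truth))


def percentile(values, q):
    """q-th percentile with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def hd95(pred, truth):
    """95th-percentile Hausdorff distance between two non-empty voxel sets."""
    if not pred or not truth:
        raise ValueError("HD95 is undefined for an empty set")
    # Directed distances: each voxel to its nearest voxel in the other set.
    d_pg = [min(math.dist(x, y) for y in truth) for x in pred]
    d_gp = [min(math.dist(y, x) for x in pred) for y in truth]
    return max(percentile(d_pg, 95), percentile(d_gp, 95))
```

Taking the 95th percentile rather than the maximum makes the distance less sensitive to a handful of outlier voxels.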

These metrics are calculated in a lesion-wise manner, enhancing sensitivity to multifocal or fragmented pathologies common in the post-treatment context (Verdier et al., 28 May 2024). Cohort-wide summary statistics of intensities and lesion size distributions are generated internally for quality assessment but are not publicly tabulated.
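
One way to picture lesion-wise scoring is to split each mask into connected components and score every ground-truth lesion separately. The sketch below is an assumption-laden illustration, not the challenge's scoring code: it uses 6-connectivity, matches a lesion to every predicted component that overlaps it, and ignores predicted components that overlap no lesion. The source does not describe the actual matching rule or any penalties for unmatched lesions.

```python
from collections import deque

# Face-adjacent neighbours only (6-connectivity), an illustrative choice.
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def connected_components(voxels):
    """Split a set of (x, y, z) voxels into 6-connected components."""
    remaining = set(voxels)
    components = []
    while remaining:
        seed = remaining.pop()
        component = {seed}
        queue = deque([seed])
        while queue:  # breadth-first flood fill from the seed
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBOURS:
                nb = (x + dx, y + dy, z + dz)
                if nb in remaining:
                    remaining.remove(nb)
                    component.add(nb)
                    queue.append(nb)
        components.append(component)
    return components


def lesionwise_dice(pred, truth):
    """Dice for each ground-truth lesion against the predicted components overlapping it."""
    pred_components = connected_components(pred)
    scores = []
    for lesion in connected_components(truth):
        matched = set().union(*[c for c in pred_components if c & lesion])
        scores.append(2 * len(matched & lesion) / (len(matched) + len(lesion)))
    return scores
```

A small lesion that is missed entirely contributes a score of 0 under this scheme, whereas a single whole-volume Dice would barely register it.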

6. Impact, Research Context, and Usage Scenarios

The MU-Glioma-Post dataset serves as the current reference benchmark for post-treatment glioma MRI segmentation. Its combination of large-scale, multi-institutional acquisition, expert consensus segmentations, and standardized preprocessing enables advances in clinical-grade segmentation model development and facilitates research into:

  • Tumor progression and patterns of recurrence post-treatment
  • Segmentation and quantification of surgically-altered and therapy-modified tissue
  • Methodological studies of algorithm robustness to protocol heterogeneity
  • Model generalization, transfer learning, and domain adaptation across sites

Additionally, the precise sub-region taxonomy and rigorous annotation protocol address well-known shortcomings of pre-treatment segmentation benchmarks, supporting the translation of automated models toward integration in routine clinical assessment and trial endpoints. Access under an open data use agreement further lowers barriers to entry for global collaborative efforts.

[Editor's term]: "mpMRI" refers to multiparametric MRI (T1, T1-Gd, T2, FLAIR).
