BraTS 2020 Brain Tumor Segmentation Dataset

Updated 29 January 2026

The dataset is a benchmark resource with fully annotated mpMRI scans from 369 training subjects, standardizing multi-modal images for glioma segmentation.
Preprocessing includes coregistration, skull stripping, and intensity normalization to support consistent performance evaluation using DSC and HD95 metrics.
State-of-the-art methods leverage BraTS 2020 to advance segmentation strategies, improve generalizability, and drive reproducible research across institutions.

The BraTS 2020 Brain Tumor Segmentation Dataset is a benchmark resource for the evaluation and development of automated algorithms for glioma segmentation in multi-parametric magnetic resonance imaging (mpMRI). This dataset forms the backbone of the 2020 iteration of the Multimodal Brain Tumor Segmentation Challenge (BraTS 2020), providing rigorously pre-processed MRI data, multi-institutional coverage, and authoritative ground-truth annotations for algorithmic development, validation, and comparison across the research community.

1. Dataset Composition and Preprocessing

BraTS 2020 comprises a collection of 369 fully annotated training subjects, each with four co-registered MRI modalities: T1-weighted (T1), contrast-enhanced T1 (T1ce or T1Gd), T2-weighted (T2), and fluid-attenuated inversion recovery (FLAIR). All scans have been rigidly aligned to a common anatomical template, skull-stripped, and resampled to isotropic 1×1×1 mm³ voxels. Input images from different centers, institutions, and acquisition protocols are standardized so that no further resampling is required by challenge participants. Volumetric data are provided in a spatial matrix of typically 240×240×155 voxels, although extracted 2D patches (e.g., 120×120) or 3D patches (e.g., 128³ voxels) are commonly used during model training and inference (Silva et al., 2021, Cirillo et al., 2020, Yuan, 2020).
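As a rough illustration of the patch-based training mentioned above, the following sketch crops a random 128³ patch from a four-channel volume. The function name `random_patch`, the channel-first `(C, X, Y, Z)` layout, and the synthetic zero-filled volume are assumptions made for the example; they are not part of the dataset or of any cited pipeline.

```python
import numpy as np

def random_patch(volume, patch_size=(128, 128, 128), rng=None):
    """Crop one random 3D patch from a (C, X, Y, Z) array (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    spatial = volume.shape[1:]
    if any(p > s for p, s in zip(patch_size, spatial)):
        raise ValueError("patch_size must fit inside the volume")
    # Pick a random corner so the whole patch stays inside the volume.
    starts = [int(rng.integers(0, s - p + 1)) for s, p in zip(spatial, patch_size)]
    slices = tuple(slice(st, st + p) for st, p in zip(starts, patch_size))
    # Keep all channels, crop only the spatial axes.
    return volume[(slice(None),) + slices]

# Synthetic stand-in for one subject: 4 modalities (T1, T1ce, T2, FLAIR).
volume = np.zeros((4, 240, 240, 155), dtype=np.float32)
patch = random_patch(volume, rng=np.random.default_rng(0))
print(patch.shape)  # (4, 128, 128, 128)
```

The same crop indices would normally be applied to the label volume so that image and annotation stay aligned.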

The dataset is split into:

Set	Number of cases	Annotation availability	Purpose
Training	369	Ground-truth provided	Model development, cross-validation
Validation	125	Hidden (evaluation server)	Hyperparameter tuning, leaderboard
Test	166	Hidden	Final ranking, challenge evaluation

Participant-side preprocessing typically consists of per-volume or per-channel intensity normalization to zero mean and unit variance, with background (non-brain) voxels often set to zero. Because the BraTS 2020 organizers already apply coregistration and skull stripping, algorithm designers generally keep the provided geometry unchanged and limit their own preprocessing to intensity scaling.
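A minimal sketch of per-channel z-score normalization follows. It assumes that background voxels are exactly zero after skull stripping and that statistics are computed over non-zero voxels only; both are common conventions rather than requirements of the dataset, and the name `zscore_brain` is hypothetical.

```python
import numpy as np

def zscore_brain(volume, eps=1e-8):
    """Normalize each channel of a (C, X, Y, Z) array to zero mean, unit variance.

    Assumption: voxels equal to zero are background and are left at zero.
    """
    out = np.zeros_like(volume, dtype=np.float32)
    for c in range(volume.shape[0]):
        channel = volume[c].astype(np.float32)
        brain = channel != 0
        if brain.any():
            mean = channel[brain].mean()
            std = channel[brain].std()
            out[c][brain] = (channel[brain] - mean) / (std + eps)
    return out

# Small synthetic example with a fake background slab.
rng = np.random.default_rng(0)
demo = rng.uniform(100, 1000, size=(4, 8, 8, 8)).astype(np.float32)
demo[:, :2] = 0
norm = zscore_brain(demo)
```

Computing the statistics over brain voxels only keeps the large zero-valued background from dominating the mean and variance.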

2. Labeling and Annotation Protocols

BraTS 2020 provides manual voxel-level segmentations approved by expert neuroradiologists. Each tumor case is labeled for the following subregions:

Whole Tumor (WT): Encompasses peritumoral edema (ED), necrotic/non-enhancing tumor core (NCR/NET), and Gadolinium-enhancing tumor (ET).
Tumor Core (TC): Comprises ET and NCR/NET.
Enhancing Tumor (ET): Gadolinium-enhancing regions only.

Voxel annotations are encoded as integer or one-hot labels, with values typically: 0 (background), 1 (NCR/NET), 2 (ED), and 4 (ET). Competition metrics derive from these standard definitions, enabling fine-grained performance evaluation for each biological compartment (Silva et al., 2021, Cirillo et al., 2020, Yuan, 2020).
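The nested evaluation regions can be derived from an integer label map as sketched below. The sketch assumes the BraTS label convention 1 = NCR/NET, 2 = ED, 4 = ET; the function name `region_masks` is hypothetical.

```python
import numpy as np

# Assumed BraTS integer labels.
NCR_NET, ED, ET = 1, 2, 4

def region_masks(seg):
    """Return boolean masks for the three evaluated regions."""
    return {
        "WT": np.isin(seg, [NCR_NET, ED, ET]),  # whole tumor: all tumor labels
        "TC": np.isin(seg, [NCR_NET, ET]),      # tumor core: excludes edema
        "ET": seg == ET,                        # enhancing tumor only
    }

seg = np.array([0, 1, 2, 4, 2, 0])
masks = region_masks(seg)
print({name: int(mask.sum()) for name, mask in masks.items()})
# {'WT': 4, 'TC': 2, 'ET': 1}
```

Because the regions are nested (ET within TC within WT), many pipelines predict the three masks directly rather than the mutually exclusive labels.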

3. Segmentation Challenge and Evaluation Metrics

The primary task of BraTS 2020 is the automated segmentation of the tumor subregions defined above from the provided multi-channel mpMRI scans. The evaluation employs two quantitative metrics, computed separately for WT, TC, and ET:

Dice Similarity Coefficient (DSC):

$\mathrm{DSC}(X, Y) = \frac{2|X \cap Y|}{|X| + |Y|}$

95th Percentile Hausdorff Distance ($\mathrm{HD}_{95}$):

$\mathrm{HD}_{95}(X, Y) = \max \left\{ h_{95}(X, Y),\, h_{95}(Y, X) \right\}$

where $h_{95}(A, B)$ is the 95th percentile, taken over all points $a \in A$, of the minimum Euclidean distance $\min_{b \in B} \lVert a - b \rVert$ from $a$ to $B$. This dual-metric structure quantifies both volumetric overlap (DSC) and boundary fidelity ($\mathrm{HD}_{95}$) (Silva et al., 2021, Yuan, 2020).
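Both metrics can be sketched directly from the definitions above. The code below is an illustration, not the challenge's evaluation code: it uses every foreground voxel as a point (evaluation tools commonly restrict the computation to surface voxels), computes distances by brute force (practical only for small masks), assumes 1 mm isotropic spacing by default, and treats two empty masks as a Dice of 1.0. All function names are hypothetical.

```python
import numpy as np

def dice(x, y):
    """Dice similarity coefficient between two binary masks."""
    x, y = x.astype(bool), y.astype(bool)
    denom = int(x.sum()) + int(y.sum())
    if denom == 0:
        return 1.0  # assumption: two empty masks agree perfectly
    return float(2.0 * np.logical_and(x, y).sum() / denom)

def directed_h95(a_pts, b_pts):
    """95th percentile over points in A of the minimum distance to B."""
    d = np.linalg.norm(a_pts[:, None, :] - b_pts[None, :, :], axis=-1)
    return float(np.percentile(d.min(axis=1), 95))

def hd95(x, y, spacing=(1.0, 1.0, 1.0)):
    """Symmetric 95th-percentile Hausdorff distance in physical units."""
    a = np.argwhere(x) * np.asarray(spacing)
    b = np.argwhere(y) * np.asarray(spacing)
    if len(a) == 0 or len(b) == 0:
        raise ValueError("HD95 is undefined when either mask is empty")
    return max(directed_h95(a, b), directed_h95(b, a))

# Two 4x4x4 cubes offset by one voxel along the first axis.
x = np.zeros((10, 10, 10), dtype=bool)
y = np.zeros((10, 10, 10), dtype=bool)
x[2:6, 2:6, 2:6] = True
y[3:7, 2:6, 2:6] = True
print(dice(x, y), hd95(x, y))  # 0.75 1.0
```

In the example, 48 of the 64 voxels in each cube overlap, giving a Dice of 2·48/128 = 0.75, and the non-overlapping face lies exactly one voxel from the other mask, giving a distance of 1 mm.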

The challenge infrastructure maintains a blinded evaluation protocol for the validation and test sets, preventing direct overfitting and supporting fair model assessment.

4. Utilization in Segmentation Architectures

Multiple state-of-the-art deep learning architectures have leveraged BraTS 2020 for segmentation research:

Deep Layer Aggregation Networks (DLA-FCN Cascade): Utilize cascaded 2D FCNs with iterative and hierarchical layer aggregation, Gaussian-filter-guided downsampling, and multi-stage auxiliary loss supervision. Each stage inputs both MRI channels and upstream features/probabilities, enabling systematic coarse-to-fine refinement. Mean Dice scores on the 2020 test set can exceed 0.88 for WT segmentation, with moderate Hausdorff distances (e.g., 5.32 mm for WT) (Silva et al., 2021).
3D GAN-based Models (Vox2Vox): Employ volumetric U-Net architectures as GAN generators, integrated with instance normalization, residual bottlenecks, and PatchGAN discriminators. The adversarial loss acts as a realism regularizer alongside a generalized Dice loss; combined with cross-validated ensembling and small-structure postprocessing, this yields competitive performance across all regions, e.g., Dice scores of 87.20% (WT), 81.14% (TC), and 78.67% (ET) (Cirillo et al., 2020).
Scale Attention Networks (SA-Net): Encoder-decoder models integrating dynamic scale attention layers to fuse multi-scale features and ResSE blocks to enhance spatial and channel feature interaction. Patch-based training, deep supervision, and 11-model ensembling result in mean Dice scores of 0.8828 (WT), 0.8433 (TC), and 0.8177 (ET); the method ranked third among 693 challenge submissions (Yuan, 2020).

All methods employ data augmentation including random rotations, intensity perturbation, spatial flips, and at times elastic deformations to maximize robustness to inter-institutional MRI variability.
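Two of the augmentations listed above, spatial flips and intensity perturbation, can be sketched with NumPy alone. Rotations and elastic deformations require interpolation and are omitted here. The function name `augment`, the flip probability, and the scale and shift ranges are illustrative assumptions, not values taken from the cited papers.

```python
import numpy as np

def augment(image, seg, rng, flip_prob=0.5,
            scale_range=(0.9, 1.1), shift_range=(-0.1, 0.1)):
    """Randomly flip and intensity-perturb one sample.

    image: (C, X, Y, Z) float array; seg: (X, Y, Z) integer label map.
    """
    # Flip image and labels together so they stay aligned.
    for axis in range(3):
        if rng.random() < flip_prob:
            image = np.flip(image, axis=axis + 1)
            seg = np.flip(seg, axis=axis)
    image = image.copy()
    # Per-channel scale and shift, applied to non-zero (brain) voxels only
    # so that the zero background is preserved.
    for c in range(image.shape[0]):
        brain = image[c] != 0
        scale = rng.uniform(*scale_range)
        shift = rng.uniform(*shift_range)
        image[c][brain] = image[c][brain] * scale + shift
    return image, seg.copy()

rng = np.random.default_rng(0)
image = rng.normal(size=(4, 16, 16, 8)).astype(np.float32)
image[:, :4] = 0                      # fake background slab
seg = np.zeros((16, 16, 8), dtype=np.uint8)
seg[6:10, 6:10, 2:5] = 4              # fake enhancing-tumor blob
aug_image, aug_seg = augment(image, seg, rng)
```

Flips leave label values untouched, so the augmented label map needs no interpolation or re-thresholding.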

5. Standardization, Variability, and Generalization

The multi-institution, multi-protocol origin of the BraTS 2020 dataset necessitates robust intensity normalization and harmonization. Pre-standardized imaging geometry and skull stripping allow network architectures to focus on signal and texture consistency rather than domain adaptation complexities. The inclusion of various acquisition protocols is widely credited with improving algorithmic generalizability to unseen clinical and research MRI data (Cirillo et al., 2020).

A plausible implication is that successful segmentation models trained on BraTS 2020 are less prone to overfitting single-center idiosyncrasies than those trained on homogeneous datasets.

6. Limitations and Recommendations for Future Research

Despite preprocessing and standardized annotation, notable challenges persist in BraTS 2020 usage:

Inter-case variability: Large standard deviations and outlier $\mathrm{HD}_{95}$ values, particularly for ET and TC, suggest model failure on rare or atypical cases, especially small or irregularly shaped lesions.
Label imbalance: The pronounced skew in subregion volumes (e.g., small ET clusters) motivates specialized loss functions (boundary/focal loss) and sampling strategies.
2D vs 3D contextualization: While some pipelines slice images axially, full volumetric (3D) context may be required for optimal delineation of connected tumor components (Silva et al., 2021).
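As one example of a loss aimed at label imbalance, a binary focal loss can be sketched as below. It down-weights voxels that are already classified confidently so that rare, hard voxels (for instance small ET clusters) contribute more to the average. This is a generic formulation, not the loss used by any of the cited methods; the defaults `gamma=2.0` and `alpha=0.25` are common choices and should be treated as assumptions.

```python
import numpy as np

def binary_focal_loss(prob, target, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss for predicted foreground probabilities.

    prob: predicted probability of the positive class per voxel.
    target: binary ground truth of the same shape.
    """
    prob = np.clip(prob, eps, 1.0 - eps)
    # p_t is the probability assigned to the true class of each voxel.
    p_t = np.where(target == 1, prob, 1.0 - prob)
    alpha_t = np.where(target == 1, alpha, 1.0 - alpha)
    # (1 - p_t)^gamma shrinks the loss of well-classified voxels.
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))

target = np.array([1, 0])
easy = binary_focal_loss(np.array([0.9, 0.1]), target)  # confident and correct
hard = binary_focal_loss(np.array([0.1, 0.9]), target)  # confident and wrong
print(easy < hard)  # True
```

With `gamma=0` and `alpha=0.5` the expression reduces to half the ordinary binary cross-entropy, which is a convenient sanity check.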

Recommended future directions include explicit handling of label imbalance, more refined ensembling and postprocessing to mitigate rare catastrophic errors, and use of the additional metadata released with BraTS to build integrated segmentation and outcome models.
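A simple form of ensembling and postprocessing can be sketched as follows: class probabilities from several models are averaged, and a very small predicted ET region is relabeled as NCR/NET, a heuristic of the kind used in some BraTS pipelines. The sketch assumes the label convention 1 = NCR/NET, 2 = ED, 4 = ET; the function names and the `min_voxels=500` threshold are hypothetical and do not reproduce the rules of the cited methods.

```python
import numpy as np

# Assumed mapping from channel index to BraTS label value.
LABELS = np.array([0, 1, 2, 4])  # background, NCR/NET, ED, ET

def ensemble_predict(prob_maps):
    """Average per-model class probabilities, each (4, X, Y, Z), then argmax."""
    mean_prob = np.mean(np.stack(prob_maps, axis=0), axis=0)
    return LABELS[np.argmax(mean_prob, axis=0)]

def suppress_small_et(seg, min_voxels=500):
    """Relabel ET as NCR/NET when the predicted ET region is very small."""
    out = seg.copy()
    if 0 < np.count_nonzero(out == 4) < min_voxels:
        out[out == 4] = 1
    return out

# Three synthetic "models" producing random per-voxel class probabilities.
rng = np.random.default_rng(0)
probs = [rng.dirichlet(np.ones(4), size=(4, 4, 4)).transpose(3, 0, 1, 2)
         for _ in range(3)]
seg = ensemble_predict(probs)
cleaned = suppress_small_et(seg)
```

Relabeling rather than deleting the small ET region keeps the tumor core unchanged, so only the ET score is affected by the correction.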

7. Impact and Benchmark Status

The BraTS 2020 dataset serves as a gold standard for brain tumor segmentation algorithm benchmarking due to its expert-curated ground-truth, multimodal data schema, and rigorously blinded evaluation setup. It underpins advances in network architectures from DLA and scale attention to volumetric GANs, supporting meaningful progress in clinical neuroradiology and computational medical imaging. The dataset’s structure and competitive challenge framework drive both reproducibility and state-of-the-art performance reporting across the field (Silva et al., 2021, Cirillo et al., 2020, Yuan, 2020).