
Multisite Breast DCE-MRI Dataset

Updated 28 November 2025
  • The dataset is a comprehensive resource of dynamic contrast-enhanced MRIs collected from multiple international sites, offering diverse imaging conditions and protocols.
  • It integrates detailed expert annotations, harmonized clinical metadata, and heterogeneous acquisition parameters to support tumor segmentation and treatment response prediction.
  • Robust evaluation metrics, including Dice coefficient and normalized Hausdorff Distance, are used to benchmark segmentation and classification performance across varied clinical scenarios.

A multisite breast DCE-MRI dataset is a compiled resource of dynamic contrast-enhanced magnetic resonance imaging from multiple clinical centers, designed to support the development, validation, and benchmarking of machine learning and radiomics models for breast cancer imaging. These datasets integrate pre-treatment T1-weighted DCE-MRI from various international trials or institutions, combining imaging, expert annotations, and harmonized clinical metadata to address tumor segmentation, treatment response prediction, and other clinically relevant tasks. Among such resources, the MAMA-MIA “Multisite Breast DCE-MRI Dataset” serves as the current publicly available benchmark for large-scale, expert-annotated data in this domain (Garrucho et al., 19 Jun 2024).

1. Cohort Composition and Site Diversity

The MAMA-MIA dataset aggregates 1,506 pretreatment DCE-MRI studies from four major collections: ISPY1 (n=171), ISPY2 (n=980), Duke (n=291), and NACT-Pilot (n=64). These source collections span nine or more hospitals and capture heterogeneity in acquisition hardware and protocols representative of real-world clinical practice. The full dataset retains the native distribution of centers and protocols, with no additional balancing or synthetic reweighting applied (Garrucho et al., 19 Jun 2024).

Participants are women undergoing neoadjuvant chemotherapy for biopsy-confirmed breast cancer, with inclusion criteria centered on the availability of pre-treatment T1-weighted DCE-MRI and expert-annotated segmentations of the index tumor. The patient-level metadata span demographic, clinical, and imaging-acquisition variables. Some papers leveraging the dataset (e.g., Musah, 3 Aug 2025) do not further stratify results by site, age, or molecular subtype, while others reference the harmonized clinical variables available in the master metadata table (Garrucho et al., 19 Jun 2024).

2. Acquisition Protocols and Imaging Parameters

Heterogeneity in image acquisition is a key characteristic and challenge of multisite DCE-MRI datasets. Among MAMA-MIA centers, Siemens, GE, and Philips scanners are represented, with field strengths of 1.5 T (72%) and 3 T (28%). Acquisition planes vary (ISPY1/NACT sagittal, ISPY2/Duke axial). Up to 11 dynamic phases per study are available, with the mean number of phases differing by collection: 3 (ISPY1), 7 (ISPY2), 4 (Duke), 3 (NACT).

Table of core acquisition details (from the master dataset release (Garrucho et al., 19 Jun 2024)). Bracketed values are [min, max].

| Parameter | ISPY1 | ISPY2 | Duke | NACT | Overall |
|---|---|---|---|---|---|
| Acquisition plane | Sagittal | Axial | Axial | Sagittal | |
| Field strength | 1.5 T (100%) | 1.5/3 T | 1.5/3 T | 1.5 T (100%) | 1.5/3 T |
| Mean number of phases | 3 [3, 6] | 7 [4, 11] | 4 [3, 6] | 3 [3, 7] | 6 [3, 11] |
| Mean number of slices | 64 [44, 256] | 106 [52, 256] | 169 [60, 256] | 60 [46, 64] | 111 [44, 256] |
| Slice thickness (mm) | 2.4 [1.5, 4.0] | 2.0 [0.8, 3.0] | 1.1 [1.0, 2.5] | 2.0 [2.0, 2.4] | 1.9 [0.8, 4.0] |
| Pixel spacing (mm) | 0.8 [0.4, 1.2] | 0.7 [0.3, 1.4] | 0.7 [0.5, 1.3] | 0.7 [0.4, 0.9] | 0.7 [0.3, 1.4] |

Specific parameters such as TR, TE, flip angle, injection rate, and coil type are not uniformly reported in the summary tables but are available in the original DICOM headers. Contrast protocols were not harmonized; agent, dose, and injection rate vary by trial (Garrucho et al., 19 Jun 2024, Wang et al., 20 Nov 2025). Models trained on the dataset therefore need to be robust to protocol variance, which is a focus of recent methodological studies (Wang et al., 20 Nov 2025).

3. Annotation Protocols and Ground Truth

Primary tumor and non-mass enhancement regions were annotated by a panel of 16 experts from nine centers, with an average of 9 years’ experience. The workflow involved initial nnU-Net-based automated segmentations within a 3D volume-of-interest (VOI), which were then manually corrected by the experts using Mango (v4.1). Inclusion criteria for annotation focused on the index lesion (mass/NME), explicitly excluding nodes, clips, and benign regions. Each annotator processed approximately 70 cases, with guidelines and consensus criteria enforced for consistency (Garrucho et al., 19 Jun 2024).

No formal inter-reader variability metrics (e.g., Dice/ICC) are provided for the final masks. However, two radiologists separately rated the quality of the automatic masks, and standard segmentation metrics such as the Dice coefficient (overlap) and the 95th-percentile Hausdorff Distance (boundary distance) were used (Garrucho et al., 19 Jun 2024). Protocols for segmentation coverage, exclusion/inclusion, and VOI extent are documented in the dataset release.

4. Data Preprocessing and Harmonization

Standardization steps include:

  • Resampling all studies to isotropic 1×1×1 mm³ using appropriate interpolation.
  • Consistent orientation mapping (sagittal: PSR; axial: LAS).
  • Cropping VOIs around the tumor region.
  • Z-score intensity normalization over all phases for model input.

No explicit bias-field correction (e.g., N4) or advanced site-harmonization such as ComBat is applied in the master release (Garrucho et al., 19 Jun 2024). For model development, preprocessing frequently includes 3D patch extraction (e.g., 128×128×128 voxels), random flips, and spatial augmentations (Musah, 3 Aug 2025). A plausible implication is that further harmonization research could mitigate remaining domain shifts.
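Two of the standardization steps listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the dataset's release code: the function names are hypothetical, each phase is represented as a flat list of voxel intensities, and "over all phases" is assumed to mean a single mean and (population) standard deviation shared by every phase of a study.

```python
import math

def resampled_shape(shape, spacing, target=(1.0, 1.0, 1.0)):
    """Grid size after resampling to `target` spacing (mm).

    Keeps the physical extent (voxels * spacing) roughly constant.
    """
    return tuple(max(1, round(n * s / t)) for n, s, t in zip(shape, spacing, target))

def zscore_over_phases(phases):
    """Z-score all phases of one study with a shared mean and std.

    `phases` is a list of phases, each a flat list of voxel intensities.
    Sharing the statistics across phases preserves the relative
    enhancement between pre- and post-contrast images (an assumption
    about what "over all phases" means here).
    """
    values = [v for phase in phases for v in phase]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # guard against constant images
    return [[(v - mean) / std for v in phase] for phase in phases]

# A hypothetical axial study: 106 slices of 512x512 at 2.0 x 0.7 x 0.7 mm.
print(resampled_shape((106, 512, 512), (2.0, 0.7, 0.7)))
# Two toy "phases" with two voxels each.
print(zscore_over_phases([[0.0, 2.0], [4.0, 6.0]]))
```

Actual resampling would additionally interpolate the voxel values onto the new grid; only the target grid size is computed here.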

5. Evaluation Metrics and Benchmarking Procedures

Segmentation and classification are evaluated using well-defined, widely accepted quantitative metrics:

  • Dice coefficient:

$$\mathrm{Dice}(P, G) = \frac{2|P \cap G|}{|P| + |G|}$$

where $P$ is the prediction and $G$ is the ground truth (Musah, 3 Aug 2025).

  • Normalized Hausdorff Distance (NormHD):

$$\mathrm{NormHD}(P, G) = \frac{1}{D_{\max}} \max \left\{ \sup_{p \in P} \inf_{g \in G} d(p, g), \sup_{g \in G} \inf_{p \in P} d(g, p) \right\}$$

where $d(\cdot,\cdot)$ is Euclidean distance and $D_{\max}$ is the image diagonal (Musah, 3 Aug 2025).

  • Balanced Accuracy (for pCR classification):

$$\mathrm{BalancedAccuracy} = \frac{1}{2}\left( \frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}} + \frac{\mathrm{TN}}{\mathrm{TN}+\mathrm{FP}} \right)$$

(Musah, 3 Aug 2025).
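The three metrics above can be written directly from their definitions. The following is a minimal sketch rather than any challenge's official scoring code. It assumes masks are given as sets of integer voxel coordinates, that the image diagonal is measured in voxel units as the square root of the summed squared dimensions, and that two empty masks score a Dice of 1; all function names are illustrative.

```python
import math

def dice(pred, truth):
    """Dice overlap between two masks given as sets of voxel coordinates."""
    if not pred and not truth:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * len(pred & truth) / (len(pred) + len(truth))

def _directed_hausdorff(a, b):
    """Largest distance from a point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def norm_hd(pred, truth, shape):
    """Symmetric Hausdorff distance divided by the image diagonal."""
    if not pred or not truth:
        raise ValueError("Hausdorff distance is undefined for an empty mask")
    d_max = math.sqrt(sum(n * n for n in shape))  # assumed diagonal definition
    hd = max(_directed_hausdorff(pred, truth), _directed_hausdorff(truth, pred))
    return hd / d_max

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = pCR)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))

# Toy masks in a 3 x 4 x 12 volume (diagonal = 13 voxels).
pred = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
truth = {(0, 0, 0), (0, 0, 1), (0, 0, 2)}
print(dice(pred, truth))                  # 2*2 / (3+3)
print(norm_hd(pred, truth, (3, 4, 12)))   # 1 / 13
print(balanced_accuracy([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1]))  # 0.625
```

The brute-force Hausdorff computation is quadratic in the number of voxels, so it is only suitable for small masks or surface points; production code would typically use distance transforms instead.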

A baseline 3D nnU-Net trained with 5-fold cross-validation achieves mean Dice scores of 0.7620±0.2113 for tumor segmentation (Garrucho et al., 19 Jun 2024). More recent methods applying large-kernel MedNeXt models report mean Dice of 0.67 (ensemble) and normalized HD of 0.24 in challenge-style validation (Musah, 3 Aug 2025). For pCR classification, balanced accuracy averaged 57%, with subgroup performance up to 75% (Musah, 3 Aug 2025). No studies report fixed, public splits beyond cross-validation and live leaderboard evaluation.
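Because no fixed public split is reported, anyone reproducing a 5-fold protocol has to define folds themselves. The sketch below shows one simple way to do so with a seeded shuffle; it is not the partitioning used by nnU-Net or by the cited papers, and the case identifiers are hypothetical.

```python
import random

def k_fold_splits(case_ids, k=5, seed=0):
    """Yield (train, validation) lists for k-fold cross-validation.

    Cases are shuffled once with a fixed seed so the folds are
    reproducible, then dealt round-robin into k folds.
    """
    ids = sorted(case_ids)
    random.Random(seed).shuffle(ids)
    folds = [ids[i::k] for i in range(k)]
    for i in range(k):
        train = [c for j, fold in enumerate(folds) if j != i for c in fold]
        yield train, folds[i]

# Hypothetical identifiers for the 1,506 studies.
cases = [f"case_{i:04d}" for i in range(1506)]
for fold, (train, val) in enumerate(k_fold_splits(cases)):
    print(fold, len(train), len(val))
```

A split like this ignores site and collection; stratifying folds by source collection is a reasonable variation when site balance matters.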

6. Clinical and Imaging Metadata

The dataset includes a harmonized table of 49 variables per patient, spanning:

  • Demographic variables: age, ethnicity, BMI, breast implants, bilateral/multifocal cancer.
  • Clinical variables: tumor subtype, ER/PR/HER2 status, T/N staging, treatment and outcome metrics.
  • Imaging-acquisition variables: scanner manufacturer, field strength, matrix size, number of phases, timing of phases, slice thickness, pixel spacing, orientation, and protocol identifiers (Garrucho et al., 19 Jun 2024).

A subset of studies and model evaluations stratify performance by clinical features such as age or breast density, but many modeling reports focus on aggregate results.
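Stratified reporting of this kind amounts to grouping per-case scores by a metadata variable. The sketch below assumes each case is a dictionary with a `dice` score and metadata keys such as `collection`; these key names and the numeric values are made up for illustration and are neither the dataset's column names nor reported results.

```python
from collections import defaultdict
from statistics import mean

def mean_dice_by(records, key):
    """Average the per-case Dice score within each value of `key`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["dice"])
    return {group: mean(scores) for group, scores in groups.items()}

# Toy records with invented scores, for illustration only.
records = [
    {"collection": "ISPY1", "field_strength_t": 1.5, "dice": 0.70},
    {"collection": "ISPY1", "field_strength_t": 1.5, "dice": 0.80},
    {"collection": "DUKE", "field_strength_t": 3.0, "dice": 0.60},
]
print(mean_dice_by(records, "collection"))
print(mean_dice_by(records, "field_strength_t"))
```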

7. Limitations, Heterogeneity, and Applications

Heterogeneity in scanner hardware, acquisition protocol, and dynamic sequence timing is intrinsic to the multisite design. This heterogeneity complicates model robustness and domain generalization, which motivates techniques such as explicit modeling of phase acquisition times (e.g., via FiLM layers (Wang et al., 20 Nov 2025)) and in-depth harmonization (Garrucho et al., 19 Jun 2024, Wang et al., 20 Nov 2025).
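For readers unfamiliar with FiLM (feature-wise linear modulation), the core operation is a per-channel scale and shift whose coefficients depend on a conditioning input, here a phase acquisition time. The sketch below is a generic illustration and not the architecture of Wang et al.: in practice the scale and shift come from a learned network, whereas this example uses a fixed linear map with hypothetical weights.

```python
def film(features, t, w_gamma, b_gamma, w_beta, b_beta):
    """Apply feature-wise linear modulation conditioned on a scalar `t`.

    features: list of channels, each a flat list of activations.
    t: conditioning value, e.g. acquisition time of a phase in minutes.
    For channel c: gamma_c = w_gamma[c] * t + b_gamma[c],
                   beta_c  = w_beta[c]  * t + b_beta[c],
    and every activation x becomes gamma_c * x + beta_c.
    """
    out = []
    for c, channel in enumerate(features):
        gamma = w_gamma[c] * t + b_gamma[c]
        beta = w_beta[c] * t + b_beta[c]
        out.append([gamma * x + beta for x in channel])
    return out

# Two channels, two activations each; weights are arbitrary placeholders.
features = [[1.0, 2.0], [3.0, 4.0]]
print(film(features, t=2.0,
           w_gamma=[0.5, 0.0], b_gamma=[0.0, 1.0],
           w_beta=[0.0, 1.0], b_beta=[1.0, 0.0]))
```

The same feature map is thus rescaled differently depending on when the phase was acquired, which is one way a network can account for variable dynamic timing across sites.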

Annotation protocols enforce stringent inclusion but inter-reader agreement is not explicitly quantified. No pharmacokinetic mapping (e.g., Tofts parameter images) is included, though all DICOM headers and raw dynamic series support such secondary analyses. The primary use cases are:

  • Benchmarking of tumor segmentation algorithms.
  • Training and evaluation of radiomics or deep learning models for pCR and survival prediction.
  • Cross-domain transfer learning, harmonization studies, and protocol sensitivity analyses.
  • Quality control and generative data synthesis (Garrucho et al., 19 Jun 2024, Musah, 3 Aug 2025).

References

  • "A large-scale multicenter breast cancer DCE-MRI benchmark dataset with expert segmentations" (Garrucho et al., 19 Jun 2024)
  • "Large Kernel MedNeXt for Breast Tumor Segmentation and Self-Normalizing Network for pCR Classification in Magnetic Resonance Images" (Musah, 3 Aug 2025)
  • "Acquisition Time-Informed Breast Tumor Segmentation from Dynamic Contrast-Enhanced MRI" (Wang et al., 20 Nov 2025)