ISLES 2022 Stroke MRI Dataset
- The ISLES 2022 dataset is a multi-center, expert-annotated MRI repository comprising 400 stroke scans for robust algorithm benchmarking.
- It integrates data from multiple centers, including training and test sets to capture heterogeneous imaging protocols and lesion characteristics.
- The annotation pipeline combines 3D-UNet pre-segmentation with expert manual corrections, achieving high median Dice scores and clinical relevance.
The ISLES 2022 dataset is a multi-center, expert-annotated magnetic resonance imaging (MRI) repository designed to advance algorithmic segmentation of ischemic stroke lesions. Serving as the basis for the 2022 Ischemic Stroke Lesion Segmentation (ISLES) challenge, it comprises 400 multi-vendor, heterogeneously acquired acute-to-subacute stroke MRI scans. The resource supports robust, reproducible, and generalizable research on automated lesion segmentation and enables benchmarking across variable patient imaging presentations and institutional protocols (Petzsche et al., 2022, Rosa et al., 2024).
1. Dataset Composition and Structure
The dataset encompasses 400 MRI examinations acquired from adult patients (age ≥ 18 years) undergoing brain imaging due to acute or suspected ischemic stroke. The split is as follows:
- Training set: 250 cases (publicly released), drawn from Center #1 (TUM Munich, Philips Achieva & Ingenia 3 T) and Center #2 (University of Bern, Siemens Verio 3 T).
- Test set: 150 cases, partitioned equally across Centers #1, #2, and #3 (UKE Hamburg, Siemens Avanto & Aera 1.5 T). These data are withheld for challenge evaluation.
To ensure protocol and lesion heterogeneity, the acquisition spans multiple vendors and MRI platforms. Data include acute (hyper-acute to 7 days) and early sub-acute (1–3 weeks) infarcts; five subjects without a visible infarct are included to increase variability.
A subset of 100 test scans reflects the post-thrombectomy sub-acute stage, while 50 are pre-thrombectomy early acute images to probe generalization to hyper-acute imaging (Rosa et al., 2024).
2. MRI Modalities and Acquisition Parameters
All cases provide:
- FLAIR (Fluid Attenuated Inversion Recovery),
- DWI (Diffusion Weighted Imaging, b = 1000 s/mm²),
- ADC (Apparent Diffusion Coefficient) maps.
Sequences were exported in NIfTI and packaged in a BIDS-compliant structure. The table below details core acquisition parameter ranges (Petzsche et al., 2022):
| Modality | TR (ms) | TE (ms) | TI (ms) | In-plane res. (mm²) | Slice thickness (mm) |
|---|---|---|---|---|---|
| FLAIR | 4,800–12,000 | 103–395 | 1,650–2,850 | 0.23×0.23–1×1 | 0.7–9.6 |
| DWI (b=1000) | 3,175–16,439 | 55–91 | – | 0.88×0.88–2×2 | 2.0–6.5 |
Preprocessing included conversion to NIfTI, skull-stripping for anonymization (HD-BET), and rigid FLAIR-to-DWI registration (Elastix), with Center #1’s DWI/ADC resliced axially to an isotropic 2×2 mm² in-plane resolution. No intensity normalization or spatial normalization was applied prior to release (Petzsche et al., 2022, Rosa et al., 2024).
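Because no intensity normalization is applied before release, users typically normalize each volume themselves. The following is a minimal sketch of per-volume z-score normalization, not a step of the dataset's own pipeline. It assumes the volume has already been loaded and flattened into a list of intensities, and that skull-stripped background voxels are exactly zero; both the function name and the background convention are assumptions for illustration.

```python
from statistics import fmean, pstdev


def zscore_normalize(voxels, background=0.0):
    """Z-score a flat list of intensities using brain voxels only.

    Assumption: background voxels (outside the skull-stripped brain)
    carry exactly the value `background` and are left unchanged.
    """
    brain = [v for v in voxels if v != background]
    if not brain:
        # Nothing to normalize: return an unchanged copy.
        return list(voxels)
    mean = fmean(brain)
    # Fall back to 1.0 so a constant volume does not divide by zero.
    std = pstdev(brain, mean) or 1.0
    return [
        (v - mean) / std if v != background else background
        for v in voxels
    ]


# Toy example: three brain voxels surrounded by background.
normalized = zscore_normalize([0.0, 1.0, 2.0, 3.0, 0.0])
```

Restricting the statistics to non-background voxels keeps the large zero-valued region from dominating the mean and standard deviation.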
3. Annotation Protocol and Ground Truth Definition
Annotation entailed a multi-step, hybrid workflow:
- Initial segmentation: 3D-UNet pre-segmentation, trained in-house on DWI.
- Correction: First-pass manual refinement by trained medical students.
- Review: Edits consolidated and further corrected by neuroradiology residents.
- Expert validation: Final review and approval by one of three attending neuroradiologists, each with >10 years of stroke imaging experience.
Manual editing was performed in ITK-SNAP and 3D Slicer. Ground-truth segmentation masks are provided as binary NIfTI files named according to BIDS conventions (e.g., sub-001_ses-1_dwi_space-native_label-lesion.nii.gz). Disconnected lesions may be identified using standard 3D connected-component analysis. On a 10-scan subset, two independent neuroradiologists achieved a median Dice coefficient of 0.92 ± 0.16 and a lesion-wise F1 of 0.82 ± 0.30 relative to the consensus reference (Petzsche et al., 2022, Rosa et al., 2024).
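The connected-component step mentioned above can be sketched in pure Python as a breadth-first flood fill. This is an illustrative sketch only: it assumes the binary mask is already available as a nested list indexed as mask[z][y][x], and it uses 26-connectivity, which is an assumption rather than a documented choice of the dataset authors.

```python
from collections import deque
from itertools import product


def label_components(mask):
    """Return a list of lesions, each a set of (z, y, x) voxel indices.

    Assumption: voxels sharing a face, edge, or corner belong to the
    same lesion (26-connectivity).
    """
    depth, height, width = len(mask), len(mask[0]), len(mask[0][0])
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    seen = set()
    lesions = []
    for start in product(range(depth), range(height), range(width)):
        z, y, x = start
        if not mask[z][y][x] or start in seen:
            continue
        # Flood-fill one lesion starting from this unvisited voxel.
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            cz, cy, cx = queue.popleft()
            component.add((cz, cy, cx))
            for dz, dy, dx in offsets:
                nz, ny, nx = cz + dz, cy + dy, cx + dx
                inside = (0 <= nz < depth and 0 <= ny < height
                          and 0 <= nx < width)
                if inside and mask[nz][ny][nx] and (nz, ny, nx) not in seen:
                    seen.add((nz, ny, nx))
                    queue.append((nz, ny, nx))
        lesions.append(component)
    return lesions


# Toy 1 x 2 x 5 mask containing two separate lesions.
toy_mask = [[[1, 1, 0, 0, 1],
             [1, 0, 0, 0, 0]]]
lesions = label_components(toy_mask)
lesion_sizes = sorted(len(lesion) for lesion in lesions)
```

For full-resolution volumes an array-based labeling routine would be much faster; the version above is meant only to make the logic explicit.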
4. Data Organization, Access, and Licensing
Data are structured by subject/session per BIDS recommendations:
- Each subject’s folder contains /anat/ with FLAIR and /dwi/ with DWI, ADC, and lesion masks.
- Metadata sidecars in JSON retain DICOM-derived imaging parameters when available.
- Training data (n=250) are downloadable via Zenodo (CC BY 4.0). Test data (n=150, with labels withheld) are accessible for challenge evaluation only through the Grand Challenge server, with leaderboard updates post-evaluation.
- File sizes for each 3D sequence average 30–70 MB, depending on resolution and slice count (Petzsche et al., 2022, Rosa et al., 2024).
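The folder layout described above can be turned into a small path helper. This is a hedged sketch: the anat and dwi folder names and the mask file name pattern follow the description in this article, while the session folder level, the FLAIR, DWI, and ADC file suffixes, and the root directory are assumptions that should be checked against the actual download.

```python
from pathlib import Path


def case_paths(root, subject, session="ses-1"):
    """Build expected file paths for one case.

    Assumptions (hypothetical, verify against the release): a session
    folder below each subject, and the suffixes `_FLAIR`, `_dwi`, `_adc`.
    """
    stem = f"{subject}_{session}"
    base = Path(root) / subject / session
    return {
        "flair": base / "anat" / f"{stem}_FLAIR.nii.gz",
        "dwi": base / "dwi" / f"{stem}_dwi.nii.gz",
        "adc": base / "dwi" / f"{stem}_adc.nii.gz",
        # Mask naming follows the example given in the annotation section.
        "mask": base / "dwi" / f"{stem}_dwi_space-native_label-lesion.nii.gz",
    }


# "/data/isles22" is a placeholder root directory.
paths = case_paths("/data/isles22", "sub-001")
```

Keeping the naming logic in one function means only a single place needs to change if the released file names differ from these assumptions.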
5. Evaluation Metrics and Official Challenge Procedures
Segmentation performance is evaluated on multiple complementary metrics, including:
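- Dice Similarity Coefficient (DSC): $\mathrm{DSC}(P, G) = \frac{2\,|P \cap G|}{|P| + |G|}$, where $P$ is the set of predicted lesion voxels and $G$ the set of ground-truth lesion voxels (standard definition).
- Hausdorff Distance (HD): $\mathrm{HD}(P, G) = \max\left\{\sup_{p \in P} \inf_{g \in G} d(p, g),\ \sup_{g \in G} \inf_{p \in P} d(p, g)\right\}$, where $d$ is a distance between points, typically Euclidean (standard definition).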
- Jaccard Index, Sensitivity, Specificity, Precision, F1 Score, Volume Difference, and Lesion Count Difference
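As an illustration of the voxel-wise metrics listed above, the sketch below computes them from confusion counts using their standard definitions. It is not the official challenge evaluation code. It assumes the prediction and reference masks are flattened into equal-length binary lists, and it returns 1.0 when a denominator is empty (for example, when both masks contain no lesion), which is a convention chosen here for illustration.

```python
def confusion_counts(pred, ref):
    """Count true/false positives and negatives over paired voxels."""
    tp = fp = fn = tn = 0
    for p, r in zip(pred, ref):
        if p and r:
            tp += 1
        elif p:
            fp += 1
        elif r:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def ratio(numerator, denominator, empty=1.0):
    """Divide, returning `empty` when the denominator is zero (assumption)."""
    return numerator / denominator if denominator else empty


def voxel_metrics(pred, ref, voxel_volume=1.0):
    """Standard voxel-wise overlap metrics for two binary masks."""
    tp, fp, fn, tn = confusion_counts(pred, ref)
    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "jaccard": ratio(tp, tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        # Predicted volume minus reference volume, in units of voxel_volume.
        "abs_volume_difference": abs((tp + fp) - (tp + fn)) * voxel_volume,
    }


# Toy example with six voxels.
scores = voxel_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1])
```

In this toy case there are two true positives, one false positive, and one false negative, so Dice is 4/6 and the Jaccard index is 2/4.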
Challenge submissions were accepted only as dockerized algorithms; evaluation was performed centrally, with one submission permitted per team. Final rankings used a composite metric incorporating Dice, volume difference, lesion-wise F1, and lesion count difference. No official baseline segmentation or leaderboard statistics for the test set were reported in the dataset publication (Petzsche et al., 2022, Rosa et al., 2024).
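The lesion-level components of the ranking can be illustrated as follows. This is a sketch under stated assumptions, not the official scoring code: each lesion is represented as a set of voxel coordinates (for example, from a connected-component analysis), a reference lesion counts as detected if it overlaps any predicted lesion by at least one voxel, and a predicted lesion with no overlap counts as a false positive. The overlap criterion and the function name are assumptions, and the way the challenge combines the individual metrics into a composite rank is not reproduced here.

```python
def lesion_wise_scores(pred_lesions, ref_lesions):
    """Lesion-wise F1 and lesion count difference.

    Assumption: any one-voxel overlap between a predicted and a
    reference lesion counts as a detection.
    """
    detected = sum(
        1 for ref in ref_lesions
        if any(ref & pred for pred in pred_lesions)
    )
    missed = len(ref_lesions) - detected
    spurious = sum(
        1 for pred in pred_lesions
        if not any(pred & ref for ref in ref_lesions)
    )
    denominator = 2 * detected + spurious + missed
    # Convention chosen here: two empty lesion lists agree perfectly.
    f1 = 2 * detected / denominator if denominator else 1.0
    return {
        "lesion_f1": f1,
        "lesion_count_difference": abs(len(pred_lesions) - len(ref_lesions)),
    }


# Toy example: two reference lesions, three predicted lesions.
reference = [{(0, 0, 0), (0, 0, 1)}, {(0, 3, 3)}]
predicted = [{(0, 0, 1)}, {(0, 5, 5)}, {(0, 6, 6)}]
lesion_scores = lesion_wise_scores(predicted, reference)
```

Here one reference lesion is detected, one is missed, and two predictions are spurious, giving a lesion-wise F1 of 2 / (2 + 2 + 1) = 0.4 and a lesion count difference of 1.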
6. Scientific and Clinical Relevance
The ISLES 2022 dataset establishes a high-variability multimodal MR imaging benchmark for algorithmic validation in ischemic stroke segmentation. It supports algorithm development for acute/subacute infarct detection, volumetry, and evaluation of real-world robustness to scanner, protocol, and lesion heterogeneity. Notably, recent research demonstrates successful post-challenge translation: an ensemble of high-performing ISLES’22 challenge algorithms matched or exceeded expert radiologist performance in independent external validation, with clinical biomarker extraction (e.g., core lesion volume, correlation with NIHSS and 90-day mRS scores) on par with manual assessment (Rosa et al., 2024). This suggests that the dataset’s structure and rigor enable not just challenge-focused method development, but also clinically credible automated biomarker derivation.
7. Limitations, Metadata, and Extensibility
The dataset does not provide detailed patient-level demographics or clinical variables such as age, sex, NIHSS, or mRS for the 400 cases. All distributed images retain original acquisition geometry, limiting harmonization but increasing realism for pipeline generalization. License terms (CC BY 4.0/CC BY-SA 4.0) promote broad reuse, but test data remain restricted to challenge use. A plausible implication is that although ISLES 2022 provides a robust benchmarking platform, algorithmic generalizability should be evaluated with caution relative to datasets with more extensive clinical metadata or external validation cohorts (Petzsche et al., 2022, Rosa et al., 2024).