xBD Dataset: Disaster Damage Assessment

Updated 29 August 2025

The xBD dataset is a large-scale, annotated satellite imagery resource that supports automated building damage assessment and change detection across 19 natural disaster events.
It utilizes rigorous annotation protocols and an ordinal damage scale, grading damage from 'no damage' to 'destroyed' for over 850,000 building annotations.
The integration of environmental context and multi-temporal imagery in xBD facilitates the development of advanced models for rapid, real-world disaster response.

The xBD dataset is a large-scale, high-resolution satellite imagery resource designed specifically for building damage assessment and change detection in the context of natural disasters. Developed for humanitarian assistance and disaster recovery, xBD integrates extensive multi-temporal annotated data, a rigorously defined damage scale, and contextual environmental features to enable automated algorithms for rapid post-disaster situational awareness.

1. Dataset Composition and Annotation Protocols

xBD consists of pre-disaster and post-disaster satellite images covering 19 distinct natural hazard events (including hurricanes, earthquakes, floods, wildfires, and tsunamis), with a total area of 45,362 km². The dataset contains 22,068 images and 850,736 building annotations. Each pre-disaster image is annotated with building polygons, that is, precise footprints drawn on the pre-event imagery. These polygons are then overlaid on the corresponding post-disaster image, and each building is graded according to the Joint Damage Scale, an ordinal scheme ranging from 0 (no damage) to 3 (destroyed). This scale, refined in collaboration with agencies such as CAL FIRE, the California Air National Guard, and FEMA, supports consistent damage grading across a heterogeneous set of events.
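As an illustration of how the Joint Damage Scale can be turned into ordinal codes, the sketch below reads a small label record and maps each building to a value from 0 to 3. The JSON layout and field names (`features`, `xy`, `properties`, `feature_type`, `subtype`, `uid`, `wkt`) and the label strings are assumptions made for this example; they are not specified in the text above, and the sample record is invented.

```python
import json

# Ordinal codes of the Joint Damage Scale (0 = no damage ... 3 = destroyed).
# The string keys are assumed label names, used here for illustration.
DAMAGE_LEVELS = {
    "no-damage": 0,
    "minor-damage": 1,
    "major-damage": 2,
    "destroyed": 3,
}

def damage_codes(label_json: str) -> dict:
    """Map each building uid to its ordinal damage code (None if ungraded)."""
    record = json.loads(label_json)
    codes = {}
    for feature in record["features"]["xy"]:
        props = feature["properties"]
        if props.get("feature_type") != "building":
            continue  # skip anything that is not a building footprint
        codes[props["uid"]] = DAMAGE_LEVELS.get(props.get("subtype"))
    return codes

# Hypothetical two-building post-disaster record, for illustration only.
sample = json.dumps({
    "features": {"xy": [
        {"properties": {"feature_type": "building",
                        "subtype": "major-damage", "uid": "a1"},
         "wkt": "POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0))"},
        {"properties": {"feature_type": "building",
                        "subtype": "no-damage", "uid": "b2"},
         "wkt": "POLYGON ((20 20, 30 20, 30 30, 20 30, 20 20))"},
    ]}
})

print(damage_codes(sample))  # {'a1': 2, 'b2': 0}
```

Keeping the codes as integers rather than strings preserves the ordering of the scale, which matters for any loss or metric that treats the labels as ordinal.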

Annotations also include bounding boxes and labels for environmental contextual features (fire, flood water, smoke, and volcanic flows), which give models additional information about the conditions surrounding each building. The images are split into training, testing, and holdout sets (approximately 80%/10%/10%), giving a fixed evaluation protocol for benchmarks such as the xView2 challenge.
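A reproducible 80%/10%/10% partition can be sketched as follows. This is not the procedure used to build the official splits, which the text does not describe; the function name `split_ids`, the seed, and the scene identifiers are hypothetical. Splitting by scene identifier rather than by individual image is a design choice assumed here so that a pre-disaster image and its post-disaster counterpart land in the same subset.

```python
import random

def split_ids(scene_ids, fractions=(0.8, 0.1, 0.1), seed=0):
    """Deterministically partition scene ids into train/test/holdout."""
    ids = sorted(scene_ids)              # fixed starting order
    random.Random(seed).shuffle(ids)     # seeded, so the split is repeatable
    n_train = round(fractions[0] * len(ids))
    n_test = round(fractions[1] * len(ids))
    return {
        "train": ids[:n_train],
        "test": ids[n_train:n_train + n_test],
        "holdout": ids[n_train + n_test:],
    }

# Hypothetical scene identifiers, for illustration only.
splits = split_ids([f"scene_{i:04d}" for i in range(1000)])
print({name: len(ids) for name, ids in splits.items()})
# {'train': 800, 'test': 100, 'holdout': 100}
```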

2. Data Sources and Collaborative Framework

xBD imagery is primarily sourced from the Maxar/DigitalGlobe Open Data Program, supplying three-band RGB images for various sudden onset disasters. In cases where events were not immediately covered by open data releases (“Tier 3” events), supplementary imagery was gathered through partnerships with Maxar and the National Geospatial-Intelligence Agency. The design and refinement of damage annotation scales leveraged expertise from leading disaster response organizations, ensuring operational relevance and alignment with real-world assessment criteria.

The spatial resolution and satellite metadata associated with each image are curated to ensure suitability for fine-grained detection and classification tasks, positioning xBD as the most extensive labeled resource for building damage analytics in global disaster informatics.

3. Ordinal Damage Scale and Loss Function Design

Damage assessment in xBD uses an ordinal classification scheme (0: no damage, 1: minor damage, 2: major damage, 3: destroyed). Algorithms developed on xBD often employ specialized loss functions that reflect the ordinal nature of the labels. For example, an ordinal cross-entropy loss penalizes predictions according to their “distance” from the true label, so that predicting ‘minor damage’ for a building whose true label is ‘major damage’ is penalized less than predicting ‘no damage’. In practice, this encourages models to learn gradations of damage rather than coarse binary splits.
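One common way to obtain this distance-sensitive behaviour is a cumulative ("thermometer") target encoding, in which the model predicts the probability that damage exceeds each threshold. The sketch below is a minimal pure-Python version of that idea; it is an illustrative formulation and is not claimed to be the exact loss used by any xBD baseline. The function names and example probabilities are hypothetical.

```python
import math

NUM_LEVELS = 4  # no damage, minor damage, major damage, destroyed

def ordinal_targets(level, num_levels=NUM_LEVELS):
    """Encode level k as k ones followed by zeros, e.g. 2 -> [1, 1, 0]."""
    return [1.0 if level > j else 0.0 for j in range(num_levels - 1)]

def ordinal_cross_entropy(threshold_probs, level, eps=1e-7):
    """Sum of binary cross-entropies over the thresholds P(damage > j)."""
    loss = 0.0
    for p, t in zip(threshold_probs, ordinal_targets(level)):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        loss -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return loss

true_level = 2                  # major damage
near_miss = [0.9, 0.1, 0.1]     # confident "minor": one threshold wrong
far_miss = [0.1, 0.1, 0.1]      # confident "no damage": two thresholds wrong

loss_near = ordinal_cross_entropy(near_miss, true_level)
loss_far = ordinal_cross_entropy(far_miss, true_level)
print(loss_near < loss_far)  # True
```

Each additional threshold a prediction gets wrong adds another large binary cross-entropy term, so the penalty grows with the ordinal distance from the true label.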

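The loss in the baseline classification model takes the form of a class-weighted cross-entropy:

$L = -\sum_{i=1}^N w_i \, y_i \log(p_i)$

where $N$ is the number of classes, $y_i$ is 1 if class $i$ is the true label and 0 otherwise, $p_i$ is the predicted probability for class $i$, and $w_i$ is a class-specific weight inversely proportional to the frequency of class $i$, which addresses the severe class imbalance. The weights $w_i$ handle imbalance only; sensitivity to ordinal distance has to come from elsewhere, for example from how the targets are encoded.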
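The class-weighted cross-entropy can be sketched in a few lines. With a one-hot target, only the term for the true class contributes. The normalisation `total / (num_classes * count)` is one common inverse-frequency heuristic and is an assumption here, as are the function names and the label distribution, which is invented to mimic an imbalanced dataset.

```python
import math
from collections import Counter

def inverse_frequency_weights(labels, num_classes=4):
    """w_i = total / (num_classes * count_i); unseen classes get weight 0."""
    counts = Counter(labels)
    total = len(labels)
    return [total / (num_classes * counts[c]) if counts[c] else 0.0
            for c in range(num_classes)]

def weighted_cross_entropy(probs, true_class, weights, eps=1e-12):
    """Class-weighted cross-entropy for one sample with a one-hot target."""
    return -weights[true_class] * math.log(max(probs[true_class], eps))

# Hypothetical label distribution dominated by "no damage" (class 0).
labels = [0] * 80 + [1] * 10 + [2] * 6 + [3] * 4
weights = inverse_frequency_weights(labels)
print(weights[0], weights[3])  # 0.3125 6.25

# The same predicted probability costs more when the true class is rare.
common = weighted_cross_entropy([0.5, 0.2, 0.2, 0.1], 0, weights)
rare = weighted_cross_entropy([0.1, 0.2, 0.2, 0.5], 3, weights)
print(common < rare)  # True
```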

4. Environmental Context and Ancillary Labels

xBD explicitly annotates environmental phenomena—fire, flood water, smoke, and volcanic flows—using bounding boxes and categorical labels. These annotations are critical for models that aim to distinguish among causes of damage, supporting more accurate attribution (e.g., differentiating between flood-induced and fire-induced destruction in multi-hazard settings). The ability to correlate building damage states with surrounding environmental features enriches the dataset for advanced multimodal modeling and facilitates broader research on event attribution and multi-source data fusion in disaster impact analysis.
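Correlating a building with nearby environmental annotations can be as simple as a point-in-box test. The sketch below is a deliberately simple illustration: it checks whether a footprint's vertex centroid falls inside each environmental bounding box. The data layout (vertex lists and `(xmin, ymin, xmax, ymax)` boxes in pixel coordinates), the function names, and the sample values are all hypothetical.

```python
def centroid(polygon):
    """Mean of the vertices; adequate for small, roughly convex footprints."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return sum(xs) / len(xs), sum(ys) / len(ys)

def hazards_near(building, hazard_boxes):
    """Return the labels of all boxes that contain the building centroid."""
    cx, cy = centroid(building)
    return sorted({
        label
        for label, (xmin, ymin, xmax, ymax) in hazard_boxes
        if xmin <= cx <= xmax and ymin <= cy <= ymax
    })

# Hypothetical footprint and environmental boxes, in pixel coordinates.
footprint = [(10, 10), (20, 10), (20, 20), (10, 20)]
boxes = [
    ("flood water", (0, 0, 50, 50)),
    ("fire", (100, 100, 200, 200)),
]
print(hazards_near(footprint, boxes))  # ['flood water']
```

The resulting labels could then be attached to each building as extra categorical features alongside its damage grade.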

5. Algorithmic Integration and Baseline Models

xBD has catalyzed the development of sophisticated computer vision pipelines for damage assessment and change detection. The dataset supports:

Building localization via a modified U-Net architecture adapted from SpaceNet, used for pixelwise footprint extraction.
Joint damage classification models using pre-trained ResNet50 backbones fused with shallow CNNs, supporting ordinal and categorical damage grading.
Standardized training and evaluation protocols for the xView2 challenge, supporting reproducibility and methodological transparency.
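To make the evaluation side concrete, the sketch below combines a localization F1 with per-class damage F1 scores. The 0.3/0.7 weighting and the use of a harmonic mean over the four damage classes follow the publicly described xView2 scoring rule as recalled here; treat both as assumptions and check the challenge documentation before relying on them. The function names and the counts in the example are hypothetical.

```python
def f1(tp, fp, fn):
    """F1 score from true-positive, false-positive, false-negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def harmonic_mean(values):
    """Harmonic mean; zero if any value is zero."""
    if any(v == 0 for v in values):
        return 0.0
    return len(values) / sum(1.0 / v for v in values)

def combined_score(loc_f1, damage_f1s, loc_weight=0.3, damage_weight=0.7):
    """Weighted sum of localization F1 and harmonic-mean damage F1.

    The default weights are assumed, not taken from the text above.
    """
    return loc_weight * loc_f1 + damage_weight * harmonic_mean(damage_f1s)

# Hypothetical counts and per-class scores, for illustration only.
loc = f1(tp=80, fp=10, fn=10)
per_class = [0.9, 0.5, 0.6, 0.7]  # no damage, minor, major, destroyed
print(round(combined_score(loc, per_class), 3))
```

Because the harmonic mean is dominated by its smallest term, a model cannot score well by excelling on the abundant 'no damage' class while neglecting the rare ones.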

Models trained on xBD incorporate environmental context, class weighting, and loss designs tailored to extreme class imbalance. This supports learning across the full damage spectrum (“no damage” to “destroyed”), mitigating labeling ambiguity and making use of the contextual metadata.

6. Scale, Diversity, and Comparative Advantages

With 850,736 annotated buildings, 45,362 km² of coverage, and representation across 19 global disasters, xBD remains the largest and most diverse benchmark for building damage assessment. The breadth of annotated data, both in geographic coverage (multiple countries, disaster types, and urban morphologies) and in multimodal feature labeling (building polygons, environmental cues), supports generalizable model development and cross-event benchmarking.

The dataset’s structure, scale, and annotation rigor establish it as the primary resource for evaluating algorithms in real-world post-disaster scenarios, supporting both academic research and rapid operational deployment.

7. Impact, Limitations, and Future Directions

xBD enables remote, automated damage assessment, reducing reliance on hazardous and time-consuming field surveys and supporting more efficient resource allocation in humanitarian crises. Improvements in model architecture tailored for xBD—such as advances in change detection, multimodal fusion, and ordinal loss design—have demonstrated operational potential for real-time disaster response.

Challenges remain in capturing finer gradations of damage, generalizing to new disaster types, and integrating additional remote sensing modalities (e.g., multi-spectral, SAR). Future directions include expanding xBD to more diverse events, refining annotation granularity (e.g., for partial roof or facade collapse), and incorporating further contextual data streams to enhance both the accuracy and interpretability of automated disaster analytics.

In summary, xBD sets the benchmark for building damage detection and change analytics from satellite imagery. Its compositional depth, collaborative foundation, and integration into the technological development pipeline have established it as an indispensable resource for advancing the state of the art in disaster response, damage grading, and humanitarian operations.
