
Oblique Floodwater UAV Dataset

Updated 23 January 2026
  • Oblique Floodwater Dataset is a curated high-resolution UAV video collection offering dense flood segmentation masks and per-frame metadata.
  • It comprises 30 video sequences with up to half a million frames captured at oblique angles, supporting adaptive, low-latency segmentation on edge devices.
  • Validated with metrics like IoU ≥ 0.92, the dataset benchmarks rapid flood mapping under diverse hydrometeorological conditions.

The Oblique Floodwater Dataset is a curated collection of high-resolution oblique UAV video specifically designed for the task of flood extent mapping under real-world disaster response constraints. It provides dense pixel-level flood segmentation masks and rich per-frame metadata for approximately half a million frames captured across diverse hydrometeorological scenarios in Flanders, Belgium. The dataset is released under a permissive license with detailed annotations and is intended to facilitate development and benchmarking of segmentation models—especially those optimized for embedded, low-latency edge inference and adaptive processing in spatiotemporally redundant video streams (Sharma et al., 16 Jan 2026).

1. Dataset Composition and Characteristics

The core of the Oblique Floodwater Dataset consists of 30 distinct UAV video sequences captured during flood events, each ranging from 2 to 20 minutes in duration. All video was recorded at frame rates of 25–30 fps, yielding 3,000–36,000 frames per sequence, with an aggregate of approximately 500,000–550,000 frames.

Frame spatial resolutions vary from 1280×720 (720p) to 3840×2160 (4K), and all frames are provided at their original resolution. For model training, down-sampling to resolutions such as 1024×576 or 2048×1152 pixels is typically recommended. The dataset covers a broad array of environmental and lighting conditions—including clear, overcast, and stormy weather, times of day spanning morning to evening, and seasonal variation from spring to fall within the same region.

Each image is annotated with a binary mask distinguishing flooded areas (water) from non-flooded regions (land, vegetation, infrastructure), with per-frame water coverage ranging from ~10% to ~80% (median ≈ 30%). No official train/val/test split is provided; however, a 70/15/15% split by video sequence is recommended to ensure independent events and representative water coverage in each subset (Sharma et al., 16 Jan 2026).
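The recommended sequence-level split can be implemented in a few lines. The helper below is an illustrative sketch (the `split_sequences` name and interface are hypothetical, not part of the release); it assigns whole sequences to subsets so that frames from one flood event never leak across splits:

```python
import random

def split_sequences(seq_ids, ratios=(0.70, 0.15, 0.15), seed=0):
    """Partition sequence IDs into train/val/test by whole sequence.

    Splitting at the sequence level (not the frame level) keeps each
    flood event in exactly one subset. For finer stratification, the
    shuffle could be replaced by sorting on per-sequence water coverage.
    """
    seqs = sorted(seq_ids)
    random.Random(seed).shuffle(seqs)
    n = len(seqs)
    n_train = round(ratios[0] * n)
    n_val = round(ratios[1] * n)
    return (seqs[:n_train],
            seqs[n_train:n_train + n_val],
            seqs[n_train + n_val:])
```

With the dataset's 30 sequences, this yields roughly 21/4–5/4–5 sequences per subset.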

2. Data Acquisition Methodology

Consumer-grade quadcopters (e.g., DJI Phantom/Mavic series) equipped with 1″–1/2.3″ CMOS RGB sensors (FOV of 80°–95°) were used for acquisition. The cameras were mounted at oblique angles between 30° and 70° from nadir, with flight altitudes spanning 30–120 meters above ground level. Data were collected via a combination of forward-motion passes and hover-plus-yaw slow pans. Video was captured in H.264 format (MP4) at bitrates of 50–100 Mbps.

Pre-processing involved decoding video to raw RGB frames, applying lens-distortion correction, and performing color normalization. Additionally, 16×16-pixel patches are extracted from each image for patch-based processing workflows. Each frame is accompanied by detailed metadata, including GPS coordinates, altitude, camera attitude (pitch, yaw, roll), UTC timestamp, and source filename (Sharma et al., 16 Jan 2026).
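The 16×16-pixel patch extraction described above reduces to a NumPy reshape; `extract_patches` below is an illustrative helper, not code shipped with the dataset:

```python
import numpy as np

def extract_patches(frame, patch=16):
    """Tile an H x W x C frame into non-overlapping patch x patch blocks.

    Edge pixels that do not divide evenly are cropped. Returns an array
    of shape (N, patch, patch, C), ordered row-major over the grid.
    """
    h = frame.shape[0] - frame.shape[0] % patch
    w = frame.shape[1] - frame.shape[1] % patch
    f = frame[:h, :w].reshape(h // patch, patch, w // patch, patch, -1)
    return f.transpose(0, 2, 1, 3, 4).reshape(-1, patch, patch,
                                              frame.shape[-1])
```

A 720p RGB frame, for example, yields 45 × 80 = 3,600 patches.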

3. Annotation Protocol

The annotation protocol follows a semi-automated pipeline leveraging the Segment Anything Model 2 (SAM2):

  1. Expert annotators provide positive (water) and negative (non-water) point prompts on 5–10 keyframes per sequence.
  2. SAM2 propagates these prompts across the video, generating dense, per-frame masks.
  3. Manual spot checks are performed on approximately 10% of frames to catch and correct mask drift or labeling errors.

Each frame’s annotation is a single-channel PNG: 0 (non-flood), 1 (flood). Metadata is stored as a per-frame JSON bundle. Quality control metrics indicate that on keyframes, human/SAM2 mask agreement achieves a mean IoU ≥ 0.92, with overall pseudo-label precision/recall exceeding 95% on spot-checked samples (Sharma et al., 16 Jan 2026). No finer-grained class ontology is provided: the focus is strictly water extent delineation versus non-water.
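Reading an annotation pair is straightforward; the sketch below is illustrative (the function name is hypothetical, and no specific metadata keys are assumed beyond what the release ships):

```python
import json

import numpy as np
from PIL import Image

def load_frame_annotation(mask_path, meta_path):
    """Load one frame's binary flood mask and its JSON metadata bundle.

    The mask is a single-channel PNG with pixel values 0 (non-flood)
    and 1 (flood); the metadata dict is returned as-is.
    """
    mask = np.array(Image.open(mask_path))
    with open(meta_path) as f:
        meta = json.load(f)
    return mask, meta
```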

4. Licensing, Access, and Usage Recommendations

The Oblique Floodwater Dataset is distributed under the Creative Commons Attribution 4.0 International license (CC BY 4.0). Direct download and code are available at https://github.com/decide-ugent/floodwater-dataset.

No official split is enforced. The recommended practice is a stratified 70/15/15% split by video sequence, so that each subset contains independent flood events and spans a representative range of water-coverage ratios. Python/PyTorch code is provided to facilitate access and dataset loading:

import os

import numpy as np
from PIL import Image
import torch

class FloodwaterDataset(torch.utils.data.Dataset):
    """Pairs each frame with its mask by sorted filename order."""

    def __init__(self, img_dir, mask_dir, transform=None):
        self.images = sorted(os.listdir(img_dir))
        self.masks  = sorted(os.listdir(mask_dir))
        self.img_dir = img_dir
        self.mask_dir = mask_dir
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img_path  = os.path.join(self.img_dir,  self.images[idx])
        mask_path = os.path.join(self.mask_dir, self.masks[idx])
        image = Image.open(img_path).convert("RGB")
        mask  = Image.open(mask_path).convert("L")
        if self.transform:
            image, mask = self.transform(image, mask)
        # PIL images cannot be passed to torch.Tensor directly; convert
        # via NumPy, move channels first (C, H, W), keep the mask integral.
        image = torch.from_numpy(np.array(image)).permute(2, 0, 1).float()
        mask  = torch.from_numpy(np.array(mask)).long()
        return image, mask
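The `transform` hook expects a callable operating on the (image, mask) pair. A minimal paired transform targeting the 1024×576 training resolution recommended above might look like the following sketch (`paired_resize` is a hypothetical helper); nearest-neighbor resampling keeps the mask labels binary:

```python
import numpy as np
from PIL import Image

def paired_resize(image, mask, size=(1024, 576)):
    """Down-sample a PIL (image, mask) pair together.

    Bilinear resampling for the RGB image, nearest-neighbor for the
    mask so the label set stays exactly {0, 1}. `size` is (width,
    height) per PIL convention. Returns NumPy arrays ready for the
    tensor conversion in FloodwaterDataset.__getitem__.
    """
    image = image.resize(size, Image.BILINEAR)
    mask = mask.resize(size, Image.NEAREST)
    return np.array(image), np.array(mask)
```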

Citation is recommended as: Sharma V., Leroux S., Landuyt L., Witvrouwen N., Simoens P. (2025). Oblique Floodwater Dataset. In "Efficient On-Board Processing of Oblique UAV Video for Rapid Flood Extent Mapping," IEEE TGRS (Sharma et al., 16 Jan 2026).

5. Evaluation Metrics and Benchmark Performance

Key metrics for model evaluation on the Oblique Floodwater Dataset include:

  • Water coverage per frame:

    \text{water\_coverage} = \frac{\text{flooded\_pixels}}{\text{total\_pixels}}

  • Per-class intersection-over-union:

    \mathrm{IoU} = \frac{|\text{Prediction} \cap \text{GroundTruth}|}{|\text{Prediction} \cup \text{GroundTruth}|}

  • Mean IoU (mIoU) over both classes:

    \mathrm{mIoU} = \frac{\mathrm{IoU}_{\text{flood}} + \mathrm{IoU}_{\text{non-flood}}}{2}
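These metrics reduce to a few NumPy operations on the binary masks; the helpers below are an illustrative sketch, with function names chosen here rather than taken from the release:

```python
import numpy as np

def water_coverage(mask):
    """Fraction of flooded pixels (value 1) in a binary mask."""
    return mask.mean()

def class_iou(pred, gt, cls):
    """IoU for one class; empty-union cases score a perfect 1.0."""
    p, g = pred == cls, gt == cls
    union = np.logical_or(p, g).sum()
    return np.logical_and(p, g).sum() / union if union else 1.0

def miou(pred, gt):
    """Mean IoU over the flood (1) and non-flood (0) classes."""
    return 0.5 * (class_iou(pred, gt, 1) + class_iou(pred, gt, 0))
```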

A benchmark segmentation model with an EfficientNet-B4 backbone, paired with adaptive processing such as the Temporal Token Reuse scheme (Sharma et al., 16 Jan 2026), exhibits mIoU degradation under 0.5% alongside a 30% reduction in segmentation latency. These results validate the dataset's role in developing time-critical, edge-deployed flood mapping systems.

6. Comparative Context and Applications

The Oblique Floodwater Dataset is distinctive in its provision of real oblique, high-fidelity UAV video, pixelwise flood masks, and per-frame metadata. In comparison, AIFloodSense (Simantiris et al., 19 Dec 2025) aggregates 470 still images from 230 global flood events, classifying camera orientation via a binary "sky presence" attribute as a proxy for oblique/nadir distinction. AIFloodSense targets multiclass segmentation, camera-angle classification, and VQA, and includes broader environmental and geographic diversity, but lacks dense video and fine-grained temporal annotation.

FloodNet (Rahnemoonfar et al., 2020), another relevant resource, consists of 3,200 high-resolution UAV stills, with multi-class pixel labels allowing for nuanced post-flood understanding (e.g., flooded roads/buildings vs. natural water), but offers less temporal density and a more complex class ontology. FloodNet’s imagery is typically 10–30° off-nadir; this is less oblique than the 30–70° pitch range in the Oblique Floodwater Dataset.

A plausible implication is that the Oblique Floodwater Dataset’s focus and fine temporal granularity better support adaptive video-processing pipelines (e.g., real-time segmentation with TTR), while AIFloodSense and FloodNet address complementary research questions, such as domain generalization, multi-class scene parsing, and high-level reasoning through VQA.

7. Significance and Research Utility

The Oblique Floodwater Dataset furnishes the remote-sensing community with a benchmark tailored for challenging, real-world scenarios: rapid, high-resolution, per-frame flood mapping under strict UAV SWaP (Size, Weight, and Power) constraints. Its combination of automated feature propagation in annotation, robust quality controls, and rich georeferenced metadata aligns directly with the operational needs of both research and applied hydrological monitoring. The absence of an official split enables custom evaluation design, including event-level generalization and sensitivity to varying water coverage scenarios.

Its architecture and protocol make it suitable not only for dense segmentation benchmarking but also as a platform for the development of temporally adaptive and on-board inference algorithms, as exemplified by the Temporal Token Reuse methodology (Sharma et al., 16 Jan 2026). The dataset is positioned to enable advances in edge AI for disaster response, advancing beyond prior still-image collections in temporal scope and application realism.
