
EDS Dataset Overview

Updated 9 December 2025
  • The name “EDS Dataset” covers three distinct resources, targeting event-based segmentation, visual odometry, and imbalanced regression respectively.
  • The two vision datasets rely on careful sensor calibration, precise temporal synchronization, and robust labeling pipelines to support accurate experimental evaluation.
  • The regression datasets apply error distribution smoothing and representative subsampling to improve training efficiency and model performance.

The acronym “EDS Dataset” applies to three distinct resources in recent literature: (1) the Event-based Segmentation Dataset (ESD) for 3D object segmentation in cluttered indoor environments (Huang et al., 2023); (2) the dataset released with Event-aided Direct Sparse Odometry (EDS) for event+frame visual odometry (Hidalgo-Carrió et al., 2022); and (3) datasets constructed for Error Distribution Smoothing (EDS) in imbalanced regression (Chen et al., 4 Feb 2025). Each targets different methodological domains—event-based vision, odometry, and regression/dataset balancing, respectively. Below, each dataset’s design, structure, and usage are presented in detail.

Sensor and Acquisition Setup

The ESD dataset is designed as a benchmark for spatiotemporal object segmentation using neuromorphic (event-based) and conventional RGB-depth sensing. The acquisition rig comprises two DAVIS346C event cameras (346×260, ≈120 dB dynamic range, microsecond latency) mounted left and right of a central Intel RealSense D435 RGB-D camera (RGB: 1920×1080, depth: 1280×720, ≈60 dB dynamic range). The cameras are tilted inward by 5°, with a 0.14 m stereo baseline, mounted ≈0.82 m above the tabletop. Calibration is performed via OpenCV to estimate intrinsics $K$ (and $K_e$ for each DAVIS) and extrinsics $[R_{cw} \mid t_{cw}]$ for each device. 3D-to-2D projections enable precise re-mapping of RGB/depth information onto event images. Temporal synchronization is ensured: DAVIS events at μs resolution and D435 frames at 60 Hz, timestamped in ROS and windowed at Δt ≈ 16.7 ms for alignment.
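The projection step can be sketched in a few lines of NumPy, assuming a simple pinhole model and ignoring lens distortion; the function name and argument layout are illustrative, not the released calibration tooling:

```python
import numpy as np

def project_depth_to_event_camera(depth, K_rgb, K_event, R, t):
    """Reproject a D435 depth map into DAVIS event-camera pixel coordinates.

    depth   : HxW depth image in metres (depth-camera frame)
    K_rgb   : 3x3 intrinsics of the depth/RGB camera
    K_event : 3x3 intrinsics of the event camera
    R, t    : rotation (3x3) and translation (3,) from the depth frame
              to the event-camera frame (calibration extrinsics)
    Returns (N, 2) event-camera pixel coordinates and their depths.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth > 0
    z = depth[valid]
    # Back-project valid depth pixels to 3D points in the depth-camera frame.
    x = (u[valid] - K_rgb[0, 2]) * z / K_rgb[0, 0]
    y = (v[valid] - K_rgb[1, 2]) * z / K_rgb[1, 1]
    pts = np.stack([x, y, z], axis=1)
    # Transform into the event-camera frame and project with its intrinsics.
    pts_e = pts @ R.T + t
    uv_e = pts_e @ K_event.T
    uv_e = uv_e[:, :2] / uv_e[:, 2:3]
    return uv_e, pts_e[:, 2]
```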

Dataset Composition and Structure

The full ESD corpus comprises 145 sequences (115 train, 30 test), totaling 14,166 manually annotated RGB frames. Event streams contain 21.88 M (left) and 20.80 M (right) events, respectively. Each sequence varies along controlled scene axes: object count (2–10), robot trajectory (linear, rotational, combined), velocity (0.15–1 m/s), lighting (normal, low), camera height (high/low), and occlusion. An organized directory tree contains calibration, per-sequence event and RGB folders, per-frame annotation masks (PNG: indexed per object), and depth maps (16-bit, millimeter scale). Metadata (.json) encodes sequence-level attributes.
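As a minimal illustration of reading the per-frame labels described above, the snippet below loads a 16-bit depth map (millimeter scale) and an indexed instance mask; the file paths are hypothetical and the actual directory names may differ:

```python
import cv2
import numpy as np

# Hypothetical file names; the real per-sequence layout may differ.
depth_mm = cv2.imread("seq_001/depth/000042.png", cv2.IMREAD_UNCHANGED)  # uint16, millimetres
mask = cv2.imread("seq_001/masks/000042.png", cv2.IMREAD_UNCHANGED)      # indexed: 0 = background

depth_m = depth_mm.astype(np.float32) / 1000.0   # convert to metres
object_ids = np.unique(mask)
object_ids = object_ids[object_ids != 0]          # drop background index
print(f"{len(object_ids)} annotated objects in this frame")
```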

Annotation and Depth Labeling

RGB instance masks are manually delineated (CVAT polygon tool). Occluded/blurred frames are either extrapolated by geometric prediction or validated against event accumulation images. Events are labeled automatically: events are batched by frame timestamp, RGB masks are reprojected to event coordinates using precise transformations, and a rigid 2D ICP aligns edge sets to events, inheriting instance labels. Depth for events is interpolated from the closest D435 frame; no refinement/learned enhancement is applied for release.
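The final label-inheritance step can be approximated as a nearest-neighbour lookup from events to the reprojected, ICP-aligned mask edges; this sketch omits the rigid 2D ICP itself, and the function name and distance threshold are assumptions:

```python
import numpy as np
from scipy.spatial import cKDTree

def transfer_labels_to_events(events_xy, mask_edge_xy, mask_edge_labels, max_dist=3.0):
    """Assign instance labels to events from reprojected mask edge pixels.

    events_xy        : (N, 2) event coordinates within one frame window
    mask_edge_xy     : (M, 2) mask edge pixels already reprojected (and
                       ICP-aligned) into event-camera coordinates
    mask_edge_labels : (M,) instance id of each edge pixel
    max_dist         : events farther than this from any edge stay unlabeled (0)
    """
    tree = cKDTree(mask_edge_xy)
    dist, idx = tree.query(events_xy, k=1)
    return np.where(dist <= max_dist, mask_edge_labels[idx], 0)
```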

Usage, Preprocessing, and Benchmarks

A standard PyTorch pipeline enables sequence-wise loading and batching; events can be accumulated into frames by event count or time window. Events are normalized by polarity, resized, and optionally augmented by crop/flip (with temporal order preserved). Performance metrics include pixel accuracy ($Acc$) and mean IoU ($mIoU$), following the standard definitions

$$Acc = \frac{1}{N} \sum_i \delta(p_i, \hat{p}_i), \qquad mIoU = \frac{1}{C} \sum_j \frac{TP_j}{TP_j + FP_j + FN_j}.$$

Benchmark results show modest RGB-only segmentation performance (mIoU up to 68.77% on known objects; below 44% on unknown objects), extremely poor raw event-only transfer (at most 8.92% mIoU), and strong cross-modal fusion (CMX reaches 94.58% mIoU on known objects but only 18.90% on unseen ones). This exposes substantial unsolved gaps in event-only segmentation and in generalization to unseen objects. Full tools and calibration scripts are publicly available.
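Both metrics are straightforward to compute from per-pixel predictions; a small NumPy sketch of the definitions above:

```python
import numpy as np

def pixel_accuracy(pred, gt):
    """Fraction of pixels whose predicted class matches the ground truth."""
    return float(np.mean(pred == gt))

def mean_iou(pred, gt, num_classes):
    """Mean IoU over classes: TP / (TP + FP + FN), skipping absent classes."""
    ious = []
    for c in range(num_classes):
        tp = np.sum((pred == c) & (gt == c))
        fp = np.sum((pred == c) & (gt != c))
        fn = np.sum((pred != c) & (gt == c))
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return float(np.mean(ious))
```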

Sensing Modalities and Calibration

The EDS visual odometry dataset (Hidalgo-Carrió et al., 2022) provides time-aligned, co-located event-camera, frame, and IMU measurements with precise ground-truth pose for monocular event-based odometry research. The hardware comprises a Prophesee Gen3.1 event camera (640×480, ≥120 dB, ≤3 μs latency) and a FLIR Blackfly S camera (640×480, up to 75 Hz), both optically boresighted through a custom 50R/50T beamsplitter. IMU (InvenSense MPU-9250, 1 kHz, full triad) data are time-aligned with both cameras. Full camera/IMU intrinsics and extrinsics (via Kalibr), including distortion coefficients, are provided; the beamsplitter enables sub-pixel spatial alignment between the event and frame sensors.

Sequences and Environments

Sixteen indoor sequences (~30–65 s each) cover diverse appearance and motion regimes: toy objects under variable lighting, floor-level navigation, and both fast and slow camera motions. Sequence names encode environment and illumination. Event rates span 0.3–2 Mev/s (with higher peaks during rapid motion), frames arrive at 20–30 Hz, and individual clips reach up to 100 M events and 1,500 frames.

Data Organization

Each sequence includes HDF5 (or POColog) event arrays (timestamp, x, y, polarity), lossless frame images with metadata (exposure, gain), IMU CSVs, ground-truth pose (quaternion + translation per timestamp), and all calibration in YAML. ROS users are provided bag files compatible with sensor_msgs conventions. The event generation model (EGM) used for event–frame alignment is

$$\Delta L(\mathbf{u}_k, t_k) = L(\mathbf{u}_k, t_k) - L(\mathbf{u}_k, t_k - \Delta t_k) = p_k C,$$

where the contrast threshold $C$ is per-pixel. Ground truth for most sequences is from a 36-camera Vicon/OptiTrack system (150 Hz); a subset uses AprilTag endpoint correction.
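Reading an event stream and accumulating a brightness-increment image under the EGM can be sketched as follows; the HDF5 group/field names and the nominal contrast threshold are assumptions, not the release's exact schema:

```python
import h5py
import numpy as np

# Hypothetical HDF5 keys; the actual field names in the release may differ.
with h5py.File("sequence_01.h5", "r") as f:
    t = f["events/t"][:]                       # microsecond timestamps
    x = f["events/x"][:].astype(np.int64)
    y = f["events/y"][:].astype(np.int64)
    p = f["events/p"][:].astype(np.int8)       # polarity

# Accumulate a brightness-increment image over a 10 ms window, following the
# event generation model: each event contributes p_k * C at its pixel.
C = 0.2                                        # nominal threshold (per-pixel in practice)
t0, t1 = t[0], t[0] + 10_000                   # 10 ms in microseconds
sel = (t >= t0) & (t < t1)
pol = np.where(p[sel] > 0, 1.0, -1.0)
frame = np.zeros((480, 640), dtype=np.float32) # Prophesee Gen3.1 resolution
np.add.at(frame, (y[sel], x[sel]), pol * C)
```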

Benchmarking and Evaluation

Absolute trajectory error (ATE RMS, cm) and rotational RMSE (deg) are computed after Sim(3) alignment, following Zhang & Scaramuzza, IROS 2018. EDS outperforms prior event-based solutions and matches direct frame-based visual odometry (DSO) at typical frame rates; when frames are downsampled to below 10 Hz, EDS continues tracking at near frame-based quality via the event stream (~60 FPS effective). This demonstrates its utility for low-rate, high-dynamic-range, low-power odometry.
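A compact NumPy sketch of the evaluation step, using a Umeyama-style similarity alignment before computing the ATE RMS; the benchmark itself follows Zhang & Scaramuzza's tooling, which this only approximates:

```python
import numpy as np

def align_sim3(est, gt):
    """Umeyama similarity (Sim(3)) alignment of estimated to ground-truth positions.

    est, gt : (N, 3) associated trajectory positions at matching timestamps.
    Returns scale s, rotation R (3x3), translation t (3,) such that gt ≈ s*R*est + t.
    """
    mu_e, mu_g = est.mean(0), gt.mean(0)
    E, G = est - mu_e, gt - mu_g
    U, S, Vt = np.linalg.svd(G.T @ E / len(est))
    D = np.eye(3)
    if np.linalg.det(U @ Vt) < 0:               # guard against reflections
        D[2, 2] = -1
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) / E.var(0).sum()
    t = mu_g - s * R @ mu_e
    return s, R, t

def ate_rmse(est, gt):
    """Absolute trajectory error (RMS, input units) after Sim(3) alignment."""
    s, R, t = align_sim3(est, gt)
    aligned = s * est @ R.T + t
    return float(np.sqrt(np.mean(np.sum((aligned - gt) ** 2, axis=1))))
```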

Access and Reproducibility

Data, parsing scripts (Python, C++, ROS), supplementary code, and tools (libcaer, OpenCV, Eigen, Ceres, PCL) are available. Each sequence is released as a compressed archive (200 MB–2 GB).

Dataset Collection and Construction

In the context of Error Distribution Smoothing (Chen et al., 4 Feb 2025), the “EDS datasets” comprise a set of low-dimensional regression problems (synthetic and real-world dynamical systems) used to assess the EDS algorithm. Problems include:

  • "f-surface" (synthetic, 2D features, rational function regression)
  • Lorenz system identification (state to state-derivatives)
  • Polar moment of inertia from rectangle images and geometric features
  • Cartpole dynamics (θ, ω, I → accelerations)
  • Quadcopter vertical dynamics (height, velocity, throttle → acceleration)

Dataset sizes range from 5,000 to 40,000 samples per split; all features and labels are standardized.

Complexity-to-Density Regions and Preprocessing

Feature space is partitioned via Delaunay triangulation; each simplex (“region” Ωᵢ) is characterized by its maximal Hessian norm $g_c(\Omega)$, region diameter $g_s(\Omega)$, and density (number of samples it contains). The complexity-to-density ratio (CDR) is

$$\rho(\Omega, D) = \frac{g_c(\Omega)\, g_s(\Omega)}{|\Omega \cap D|}.$$

This ratio identifies regions of high model complexity and low density (“imbalanced”). Data are labeled as high/medium/low CDR by z-scores on the log-CDR distribution. No cleaning beyond standardization is required due to controlled generation.
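A minimal sketch of the per-region CDR computation with SciPy's Delaunay triangulation, assuming the Hessian-norm estimate for each simplex is supplied by the caller:

```python
import numpy as np
from scipy.spatial import Delaunay

def cdr_per_region(X, hessian_norm_fn):
    """Complexity-to-density ratio per Delaunay simplex.

    X               : (N, d) standardized feature matrix
    hessian_norm_fn : callable returning an estimate of the maximal Hessian
                      norm of the target function over a simplex (assumed given)
    """
    tri = Delaunay(X)
    which = tri.find_simplex(X)                  # simplex index of every sample
    cdr = np.zeros(len(tri.simplices))
    for i, simplex in enumerate(tri.simplices):
        verts = X[simplex]
        # Region diameter g_s: largest pairwise vertex distance.
        g_s = np.max(np.linalg.norm(verts[:, None] - verts[None, :], axis=-1))
        g_c = hessian_norm_fn(verts)             # complexity proxy g_c
        density = max(np.sum(which == i), 1)     # |Omega ∩ D|, avoid divide-by-zero
        cdr[i] = g_c * g_s / density
    return cdr
```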

EDS Algorithm and Representative Subsampling

EDS selects a representative subset $D_R$ by first initializing with a random triangulation, then sequentially accepting new samples if (a) they fall outside the current simplices, or (b) their linear interpolation error exceeds a global log-error threshold ψ. The procedure is formalized as

$$\min_{D_R \subseteq D} |D_R| \quad \text{s.t.} \quad \mu|_{I(F, D_R)} + z\,\sigma|_{I(F, D_R)} \leq \psi,$$

where μ and σ are the mean and standard deviation of the log-CDR over regions. This results in balanced coverage, especially of rare, high-complexity regions.
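A greedy approximation of this selection can be written with SciPy's triangulation and linear-interpolation utilities; the seeding, pass ordering, and threshold handling below are illustrative simplifications of the paper's procedure, not its exact implementation:

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator

def eds_subsample(X, y, psi, n_init=50, seed=0):
    """Greedy sketch of EDS-style representative subsampling.

    Accepts a candidate sample if (a) it lies outside the currently
    triangulated hull, or (b) the linear interpolant built on the kept
    samples mispredicts it by more than the log-error threshold psi.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    keep = list(order[:n_init])                      # random initial triangulation
    interp = LinearNDInterpolator(X[keep], y[keep])
    for idx in order[n_init:]:
        pred = float(interp(X[idx][None, :])[0])
        if np.isnan(pred) or np.log(abs(pred - y[idx]) + 1e-12) > psi:
            keep.append(idx)
            # Rebuilding the interpolant each time is inefficient but keeps the sketch short.
            interp = LinearNDInterpolator(X[keep], y[keep])
    return np.array(keep)
```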

Experimental Protocols and Metrics

Regression algorithms include MLPs (MSE loss) and SINDy (polynomial + Lasso). Metrics reported are RMSE, maximum error, and train time. Baselines are: full data (D), random size-matched subset (D_M), and EDS-representative subset (D_R). EDS achieves lower maximum error and comparable or better RMSE, with significant reductions in training time—highlighting robust coverage of challenging input regions.
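A hedged sketch of such a comparison with scikit-learn, where the MLP architecture and the subset indices are placeholders:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def evaluate_subset(X_train, y_train, X_test, y_test, idx=None, seed=0):
    """Train an MLP on a (sub)set and report RMSE and maximum absolute error."""
    if idx is not None:
        X_train, y_train = X_train[idx], y_train[idx]
    model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=seed)
    model.fit(X_train, y_train)
    err = model.predict(X_test) - y_test
    return float(np.sqrt(np.mean(err ** 2))), float(np.max(np.abs(err)))

# Hypothetical usage: compare full data D, a random size-matched subset D_M,
# and the EDS-representative subset D_R (indices from eds_subsample above).
# rmse_D,  max_D  = evaluate_subset(X, y, X_te, y_te)
# rmse_DM, max_DM = evaluate_subset(X, y, X_te, y_te, idx=random_idx)
# rmse_DR, max_DR = evaluate_subset(X, y, X_te, y_te, idx=eds_idx)
```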

Access, Organization, and Reproducibility

All code, generation scripts, datasets (raw/standardized splits), and fixed seeds/hyperparameters are provided via a public repository. Directory structure encompasses data, scripts, algorithm code, and experiment templates.

Comparative Summary Table

| Paper | Dataset | Sensing/Modality | Target Task/Utility |
|---|---|---|---|
| (Huang et al., 2023) | ESD | Dual DAVIS346 + RealSense D435 | Event-based segmentation (3D/temporal, RGB-D) |
| (Hidalgo-Carrió et al., 2022) | EDS | Prophesee event camera, FLIR frames, IMU | Event+frame visual/inertial odometry |
| (Chen et al., 4 Feb 2025) | EDS | Synthetic and dynamical-system regression data | Imbalanced regression, subset selection |

Notably, the nomenclature “EDS” refers to distinct datasets/concepts across these works. Each brings unique assets and experimental rigor to its respective field.

Significance and Research Impact

The Event-based Segmentation Dataset constitutes the first large-scale, densely annotated 3D spatiotemporal benchmark for neuromorphic segmentation in cluttered indoor scenes (Huang et al., 2023), enabling precise evaluation of multimodal fusion, event-aware learning, and robustness to occlusion/blurring. The EDS visual odometry dataset delivers temporally synchronized, co-located event/frame/IMU data with ground truth, serving as a standard for direct event-based odometry research (Hidalgo-Carrió et al., 2022). The Error Distribution Smoothing datasets operationalize the challenge of imbalanced regression, supporting rigorous benchmarking of resampling and complexity-aware training in controlled settings (Chen et al., 4 Feb 2025). Collectively, these EDS datasets advance neuromorphic perception, dynamic scene understanding, and fair evaluation in low-dimensional regression modeling.
