Event-aided Direct Sparse Odometry (EDS)

Updated 16 May 2026

EDS is a visual odometry technique that fuses high-temporal-resolution event data with standard image frames to enable robust state estimation.
It employs a direct probabilistic formulation and sliding-window photometric bundle adjustment for precise motion estimation and semi-dense 3D mapping.
EDS demonstrates low RMS translational and rotational errors in both indoor and high-dynamic scenarios, outperforming traditional frame-based methods.

Event-aided Direct Sparse Odometry (EDS) is a class of visual odometry (VO) algorithms that directly fuse asynchronous event streams from event cameras with standard image frames (and optionally depth) to estimate 6-DoF camera motion and reconstruct semi-dense 3D maps. By leveraging the high temporal resolution, high dynamic range, and blur resilience of event cameras, EDS provides robust, low-latency odometry even under rapid motion, sparse frames, or challenging illumination. The technique combines a direct probabilistic formulation, in which per-pixel brightness increments are predicted from sparse 3D structure, with image-based photometric constraints that are optimized jointly. This enables accurate and efficient state estimation in scenarios where conventional frame-based VO and SLAM struggle (Hidalgo-Carrió et al., 2022, Zhu et al., 2023).

1. Event Generation and Signal Model

EDS exploits the event generation principle of event cameras: each pixel asynchronously triggers an event $e_k = (u_k, t_k, p_k)$ whenever the local logarithmic intensity $L(u, t)$ changes by a predefined contrast threshold $C$:

$\Delta L(u_k, t_k) = L(u_k, t_k) - L(u_k, t_k - \Delta t_k) = p_k C,\quad p_k \in \{+1, -1\}.$
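The threshold rule can be illustrated for a single pixel. The following is a minimal sketch, assuming an idealized, noise-free sensor and a sampled log-intensity signal; the function name `generate_events` and the threshold value are illustrative choices, not taken from the cited papers.

```python
def generate_events(log_intensity, timestamps, C=0.2):
    """Emit (t, polarity) events for one pixel from sampled log intensity.

    An event fires each time the log intensity has moved by at least C
    away from the reference level stored at the previous event.
    """
    events = []
    ref = log_intensity[0]  # reference level at the last event
    for L, t in zip(log_intensity[1:], timestamps[1:]):
        # A large jump between two samples can trigger several events.
        while abs(L - ref) >= C:
            p = 1 if L > ref else -1
            ref += p * C
            events.append((t, p))
    return events


samples = [0.0, 0.3, 0.6, 0.1]
events = generate_events(samples, timestamps=[0, 1, 2, 3], C=0.25)
# events == [(1, 1), (2, 1), (3, -1)]
```

Real sensors add noise, refractory periods, and per-pixel threshold mismatch, none of which this sketch models.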

A first-order Taylor expansion under the optical flow assumption relates observed brightness increments to pixel velocities induced by rigid body motion:

$\Delta L(u) \approx -\nabla L(u) \cdot v(u) \Delta t,$

with image-plane velocity $v(u)$ parameterized by camera body velocity $(V, \omega)$ via:

$v(u) = J(u, Z) \begin{pmatrix} V \\ \omega \end{pmatrix},$

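where $J(u, Z)$ encodes projection geometry and depth $Z$. The observed polarity-weighted event increments are accumulated with Gaussian temporal weighting to minimize motion blur:

$\Delta L(u) = \sum_{k} g(t_k)\, p_k\, C\, \delta(u - u_k),$

where $g(t_k)$ is the Gaussian temporal weight of event $e_k$ and $\delta(u - u_k)$ selects the events that fall on pixel $u$.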
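Both sides of the signal model, the increment predicted from motion and the increment measured from events, can be sketched as follows. This is an illustration under stated assumptions: $J(u, Z)$ is taken to be the standard 2x6 feature-sensitivity matrix for calibrated (normalized) image coordinates, whose signs depend on the frame convention, and the names `t_ref` and `sigma_t` for the centre and width of the Gaussian temporal weight are hypothetical.

```python
import numpy as np


def interaction_matrix(x, y, Z):
    """2x6 matrix J(u, Z) mapping (V, omega) to image-plane velocity.

    Assumes normalized image coordinates u = (x, y) and a common
    camera-frame sign convention; other conventions flip some signs.
    """
    return np.array([
        [-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y],
        [0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x],
    ])


def predicted_increment(grad_L, x, y, Z, V, omega, dt):
    """Linearized model: Delta L ~= -grad L . v * dt, with v = J (V, omega)."""
    v = interaction_matrix(x, y, Z) @ np.concatenate([V, omega])
    return -float(np.dot(grad_L, v)) * dt


def accumulate_events(events, shape, C, t_ref, sigma_t):
    """Sum p_k * C per pixel, weighted by a Gaussian in (t_k - t_ref).

    events: iterable of (col, row, t, polarity).
    t_ref and sigma_t are hypothetical parameters of this sketch.
    """
    dL = np.zeros(shape)
    for col, row, t, p in events:
        weight = np.exp(-0.5 * ((t - t_ref) / sigma_t) ** 2)
        dL[row, col] += weight * p * C
    return dL
```

Comparing `accumulate_events` against `predicted_increment` at the selected pixels is what gives rise to the per-pixel residuals used for tracking.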

A probabilistic generative model describes the likelihood of each event given the predicted increment. The likelihood is expressed through the standard normal CDF $\Phi$, with a noise parameter $\sigma$ that captures sensor/event noise (Hidalgo-Carrió et al., 2022).

2. Direct Probabilistic Motion Formulation

For each active pixel (with high gradient and sufficient events), EDS defines a brightness increment residual:

$r(u) = \Delta L(u) - \Delta \hat{L}(u),$

where $\Delta \hat{L}(u)$ is the model-predicted brightness increment under a given motion hypothesis. The weighted least squares objective

$E = \sum_{u} w(u)\, r(u)^2$

is minimized over SE(3) increments via Gauss-Newton or Levenberg–Marquardt, with robust (Huber) per-pixel weights to attenuate outliers. Parameters are iteratively updated in a small-angle, Lie-algebra parametrization for numerical stability (Hidalgo-Carrió et al., 2022).
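The interplay of Gauss-Newton steps and Huber weights can be shown on a toy problem. The sketch below is not the EDS tracker: it fits a line instead of an SE(3) increment, and the helper names and the threshold `delta` are assumptions made for illustration. The structure (re-weight residuals, solve the weighted normal equations, apply the increment) is the same iteratively re-weighted scheme.

```python
import numpy as np


def huber_weights(r, delta):
    """Per-residual weights: 1 inside the threshold, delta/|r| outside."""
    a = np.abs(r)
    w = np.ones_like(a)
    outside = a > delta
    w[outside] = delta / a[outside]
    return w


def gauss_newton_huber(residual_fn, jacobian_fn, x0, delta=1.0, iters=20, tol=1e-10):
    """Robust Gauss-Newton: solve (J^T W J) step = -J^T W r each iteration."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        r = residual_fn(x)
        J = jacobian_fn(x)
        w = huber_weights(r, delta)
        H = J.T @ (w[:, None] * J)   # weighted Gauss-Newton Hessian
        g = J.T @ (w * r)            # weighted gradient
        step = -np.linalg.solve(H, g)
        x = x + step
        if np.linalg.norm(step) < tol:
            break
    return x


# Toy data: y = 2x + 1 with one gross outlier.
xs = np.arange(20.0)
ys = 2.0 * xs + 1.0
ys[10] += 50.0

estimate = gauss_newton_huber(
    residual_fn=lambda p: p[0] * xs + p[1] - ys,
    jacobian_fn=lambda p: np.stack([xs, np.ones_like(xs)], axis=1),
    x0=[0.0, 0.0],
)
```

With the Huber weights the outlier's pull on the estimate is bounded, so `estimate` stays close to the true slope and intercept; an unweighted least-squares fit of the same data would be shifted noticeably. Replacing the plain normal equations with a damped system would give a Levenberg–Marquardt variant.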

3. Sparse 3D Structure Selection and Parameterization

EDS implements a semi-dense mapping strategy: each keyframe selects a sparse set of high-gradient pixels, dividing the frame into tiles and retaining those with the highest Sobel gradient magnitude (typically top 10–15% per tile). The 3D structure is parameterized via inverse depth $\rho = 1/Z$, initialized by reprojecting depths from already-mapped keyframes and interpolated using inverse-depth kd-tree search. This approach provides a computationally efficient yet geometrically informative set of points for direct photometric and event-based optimization (Hidalgo-Carrió et al., 2022).
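Tile-based point selection can be sketched with NumPy alone. The tile size, the kept fraction, and the minimum-gradient cut-off `min_grad` below are illustrative assumptions; the function names are hypothetical and do not come from the EDS code base.

```python
import numpy as np


def sobel_magnitude(img):
    """Sobel gradient magnitude with edge-replicated borders."""
    p = np.pad(np.asarray(img, dtype=float), 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
        - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
        - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.hypot(gx, gy)


def select_points(img, tile=8, keep_fraction=0.15, min_grad=1e-3):
    """Return (row, col) of the strongest-gradient pixels in each tile."""
    mag = sobel_magnitude(img)
    h, w = mag.shape
    selected = []
    for r0 in range(0, h, tile):
        for c0 in range(0, w, tile):
            block = mag[r0:r0 + tile, c0:c0 + tile]
            k = max(1, int(round(keep_fraction * block.size)))
            order = np.argsort(block, axis=None)[::-1][:k]  # strongest first
            rows, cols = np.unravel_index(order, block.shape)
            for r, c in zip(rows, cols):
                if block[r, c] > min_grad:  # skip textureless pixels
                    selected.append((r0 + int(r), c0 + int(c)))
    return selected
```

Selecting per tile rather than globally keeps the points spread over the image, so a single highly textured region cannot absorb the whole point budget.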

4. Global Optimization: Photometric Bundle Adjustment

For map and trajectory refinement, EDS employs a sliding-window photometric bundle adjustment (BA) over all active keyframes. The optimization jointly refines all keyframe poses and all inverse depths by minimizing a robust semi-dense photometric error. The terms of this error involve the 3D point at a selected pixel of a keyframe, the projection function, the keyframe intensities, and the reprojected event-induced increment. Huber costs mitigate the influence of degenerate correspondences or outliers. This BA typically operates on windows of about 7 keyframes and is implemented using automatic differentiation in Ceres (Hidalgo-Carrió et al., 2022).
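A single frame-to-frame photometric residual of the kind minimized in such a bundle adjustment can be sketched as below. This is a generic direct-method residual under assumptions (pinhole intrinsics `K`, a relative pose `R`, `t` from host to target keyframe, grayscale images, no event-induced term, no affine brightness correction); it is not the exact cost of the cited work, and all names are illustrative.

```python
import numpy as np


def bilinear(img, x, y):
    """Bilinearly interpolate img at sub-pixel location (x, y)."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    ax, ay = x - x0, y - y0
    return ((1 - ax) * (1 - ay) * img[y0, x0] + ax * (1 - ay) * img[y0, x0 + 1]
            + (1 - ax) * ay * img[y0 + 1, x0] + ax * ay * img[y0 + 1, x0 + 1])


def photometric_residual(I_host, I_target, K, R, t, u, v, inv_depth):
    """Intensity difference for host pixel (u, v) with given inverse depth.

    Returns None if the point falls behind the target camera or outside
    the region where bilinear interpolation is defined.
    """
    ray = np.linalg.solve(K, np.array([u, v, 1.0]))  # back-project pixel
    X_target = R @ (ray / inv_depth) + t              # point in target frame
    if X_target[2] <= 0.0:
        return None
    x, y, _ = K @ (X_target / X_target[2])            # project into target
    h, w = I_target.shape
    if not (0.0 <= x < w - 1 and 0.0 <= y < h - 1):
        return None
    return bilinear(I_target, x, y) - I_host[v, u]


def huber(r, delta):
    """Huber cost: quadratic near zero, linear for |r| > delta."""
    a = abs(r)
    return 0.5 * r * r if a <= delta else delta * (a - 0.5 * delta)
```

A windowed optimizer would sum `huber(photometric_residual(...), delta)` over all selected points and keyframe pairs, and differentiate with respect to the poses and inverse depths.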

5. Algorithmic Workflow

The end-to-end EDS pipeline consists of the following loop:

Initialization: Coarse DSO-like bootstrapping on initial frames.
Frontend tracking (events): For each incoming frame, collect event packets; accumulate the brightness-increment image $\Delta L(u)$; perform incremental motion tracking via direct optimization.
Keyframe management: Trigger new keyframes when tracked coverage falls or a rotation threshold is exceeded; select new sparse points and initialize depths.
Backend mapping: Perform sliding-window semi-dense photometric BA; update the map.
Event batching: Events are processed in overlapping packets (e.g., 20k events, 50% overlap) to optimize signal-to-noise ratio while minimizing blur (Hidalgo-Carrió et al., 2022).
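Two of the steps above, event batching and keyframe triggering, are simple enough to sketch directly. The packet size and overlap default to the example values quoted in the list; the keyframe thresholds `min_fraction` and `max_rotation_deg` are hypothetical placeholders, since the text does not state them.

```python
def overlapping_packets(events, size=20000, overlap=0.5):
    """Yield fixed-size event packets that overlap by the given fraction."""
    step = max(1, int(round(size * (1.0 - overlap))))
    for start in range(0, len(events) - size + 1, step):
        yield events[start:start + size]


def needs_keyframe(tracked_fraction, rotation_deg,
                   min_fraction=0.7, max_rotation_deg=10.0):
    """Request a keyframe when coverage drops or rotation grows too large.

    Both thresholds are illustrative assumptions, not published values.
    """
    return tracked_fraction < min_fraction or rotation_deg > max_rotation_deg


packets = list(overlapping_packets(list(range(10)), size=4, overlap=0.5))
# packets == [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8, 9]]
```

Batching by event count rather than by fixed time makes the packet duration adapt to scene dynamics: fast motion fills a packet quickly, slow motion takes longer.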

Performance is maintained at about 60 Hz in the frontend and about 20 Hz for frames; sparse frames are sufficient because the event stream bridges the intervals between them (the “blind time”).

6. Extension to RGB-D Data and Adaptive Event Surfaces

A variant EDS pipeline for robotics fuses RGB-D frames with events. An adaptive time surface (ATS) addresses time-surface (TS) “whiteout”/“blackout” by deploying pixel-wise, motion-adaptive decay rates. With $t_{\text{last}}(u)$ the timestamp of the most recent event at pixel $u$, an exponential-decay time surface with a per-pixel decay parameter $\tau(u)$ takes the form

$\mathcal{T}(u, t) = \exp\left(-\frac{t - t_{\text{last}}(u)}{\tau(u)}\right),$

where $\tau(u)$ is adapted at each pixel to the local motion.
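The effect of a pixel-wise decay can be sketched by evaluating an exponential-decay time surface with either a scalar or a per-pixel $\tau$. This assumes the common exponential form of a time surface; the rule by which the ATS chooses $\tau$ at each pixel is not reproduced here, so `tau` is simply an input, and the function name is illustrative.

```python
import numpy as np


def time_surface(t_last, t_now, tau):
    """Exponential-decay time surface, evaluated at time t_now.

    t_last: array of last-event timestamps per pixel (-inf if none yet).
    tau:    scalar (classical TS) or per-pixel array (adaptive variant).
    """
    t_last = np.asarray(t_last, dtype=float)
    tau = np.broadcast_to(np.asarray(tau, dtype=float), t_last.shape)
    return np.exp(-(t_now - t_last) / tau)


t_last = np.array([[1.0, 0.5], [-np.inf, 1.0]])
uniform = time_surface(t_last, t_now=1.0, tau=0.1)
adaptive = time_surface(t_last, t_now=1.0, tau=np.array([[0.1, 0.5], [0.1, 0.1]]))
```

With a single global `tau`, the pixel that last fired 0.5 s ago has almost fully faded; giving that pixel a larger `tau` keeps it visible, which is the kind of per-pixel control an adaptive surface relies on.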

Pixel selection from the ATS then prioritizes spatially well-distributed, high-contrast, high-gradient regions. The full EDS objective jointly aligns RGB-D patch photometric errors and event-ATS patch errors; the final energy combines both modalities with a regularization term (Zhu et al., 2023).

7. Benchmark Performance and Application Scenarios

On indoor DAVIS-based benchmarks, monocular EDS achieves 1–2 cm RMS translational error and 1–2° RMS rotational error, outperforming monocular event-only VO (EVO, USLAM) and matching or slightly exceeding DSO and ORB-SLAM under normal frame rates. When frame rates are reduced from 20 Hz to 5 Hz, frame-only methods degrade or lose track, whereas EDS remains robust due to continuous event tracking. In robotics applications, EDS with ATS demonstrates ATE below 2 cm and competitive relative pose error (RPE) across high-dynamics tasks (e.g., bounding/backflipping quadruped robots, angular rates up to 510 °/s) on datasets where classical methods diverge (Hidalgo-Carrió et al., 2022, Zhu et al., 2023).

A summary of results is as follows:

Dataset/Scenario	EDS error	Best baseline
Indoor DAVIS (bin/boxes/etc.)	1–2 cm RMS translation, 1–2° RMS rotation	DSO/ORB-SLAM (similar)
MVSEC flying (robot)	1.2–1.75 cm ATE	DEVO 2.4–7.1 cm
Mini-Cheetah (bounding)	0.42 cm ATE	Baselines diverged
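The error measures reported in the table can be computed as follows. This is a minimal sketch, assuming the estimated and ground-truth trajectories are already time-synchronized and expressed in the same frame; the alignment step that usually precedes ATE computation is omitted, and the function names are illustrative.

```python
import numpy as np


def ate_rmse(est_xyz, gt_xyz):
    """RMS of per-pose translational error between aligned trajectories."""
    diff = np.asarray(est_xyz, dtype=float) - np.asarray(gt_xyz, dtype=float)
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))


def rotation_error_deg(R_est, R_gt):
    """Angle (degrees) of the relative rotation R_gt^T R_est."""
    R_rel = np.asarray(R_gt, dtype=float).T @ np.asarray(R_est, dtype=float)
    cos_angle = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_angle)))
```

The result of `ate_rmse` carries the units of the input positions, so trajectories given in metres yield an error in metres.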

EDS thus enables low-power, high-dynamic-range, and robust odometry for AR/VR, nano-UAVs, legged robots, and other environments where traditional frame-based VO is challenged by lighting, speed, or frame rate constraints (Hidalgo-Carrió et al., 2022, Zhu et al., 2023).

References

Hidalgo-Carrió et al. (2022). Event-aided Direct Sparse Odometry.

Zhu et al. (2023). Event Camera-based Visual Odometry for Dynamic Motion Tracking of a Legged Robot Using Adaptive Time Surface.