A Photometrically Calibrated Benchmark For Monocular Visual Odometry (1607.02555v2)

Published 9 Jul 2016 in cs.CV

Abstract: We present a dataset for evaluating the tracking accuracy of monocular visual odometry and SLAM methods. It contains 50 real-world sequences comprising more than 100 minutes of video, recorded across dozens of different environments -- ranging from narrow indoor corridors to wide outdoor scenes. All sequences contain mostly exploring camera motion, starting and ending at the same position. This allows to evaluate tracking accuracy via the accumulated drift from start to end, without requiring ground truth for the full sequence. In contrast to existing datasets, all sequences are photometrically calibrated. We provide exposure times for each frame as reported by the sensor, the camera response function, and dense lens attenuation factors. We also propose a novel, simple approach to non-parametric vignette calibration, which requires minimal set-up and is easy to reproduce. Finally, we thoroughly evaluate two existing methods (ORB-SLAM and DSO) on the dataset, including an analysis of the effect of image resolution, camera field of view, and the camera motion direction.

Summary

The paper presents a photometrically calibrated benchmark dataset for evaluating the tracking accuracy of monocular VO and SLAM methods.
It offers 50 real-world video sequences with per-frame exposure times, the camera response function, and a dense, non-parametric vignette map.
The study benchmarks ORB-SLAM and DSO on sequences that start and end at the same position, so that accumulated drift can be measured without full ground truth.

A Photometrically Calibrated Benchmark for Monocular Visual Odometry

The paper introduces a dataset for evaluating monocular visual odometry (VO) and simultaneous localization and mapping (SLAM) methods. Authored by Jakob Engel, Vladyslav Usenko, and Daniel Cremers, it provides a benchmark that addresses two limitations of existing datasets: the lack of photometric calibration and the limited variety of recorded environments.

Dataset Characteristics

The dataset comprises 50 real-world video sequences totalling more than 100 minutes, recorded in dozens of environments that range from narrow indoor corridors to wide outdoor scenes. Its main innovation is that every sequence is photometrically calibrated: the dataset provides the exposure time of each frame as reported by the sensor, the camera response function, and dense lens attenuation (vignetting) factors. This calibration matters because it lets an algorithm, or an evaluation, account for vignetting and the non-linear camera response, both of which are usually ignored in conventional datasets.
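To make the role of the three calibration quantities concrete, here is a minimal sketch of how they could be combined to correct a frame. It assumes the common image formation model in which the observed pixel value is I(x) = G(t · V(x) · B(x)), with G the response function, t the exposure time, V the attenuation map, and B the scene irradiance; the summary does not state this model, so treat it as an assumption. The function name, argument names, and toy data are hypothetical.

```python
import numpy as np

def photometric_correct(image, inv_response, vignette, exposure=None):
    """Undo response and vignetting for one 8-bit grayscale frame.

    image:        (H, W) uint8 array of raw pixel values.
    inv_response: (256,) lookup table for the inverse response G^-1
                  (assumed to be given as a table; hypothetical format).
    vignette:     (H, W) attenuation factors in (0, 1].
    exposure:     exposure time; if given, the result is divided by it,
                  yielding a value proportional to irradiance B.
    """
    energy = inv_response[image]      # apply G^-1 by table lookup
    corrected = energy / vignette     # remove lens attenuation
    if exposure is not None:
        corrected = corrected / exposure
    return corrected

# Toy data: linear response, uniform 50% attenuation, exposure of 2.
inv_response = np.arange(256, dtype=np.float64)
vignette = np.full((2, 2), 0.5)
image = np.array([[10, 20], [30, 40]], dtype=np.uint8)

corrected = photometric_correct(image, inv_response, vignette, exposure=2.0)
print(corrected)
```

In this toy case the attenuation and the exposure cancel, so the corrected values equal the raw ones; with a real response table and vignette map they would not.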

Calibration Methodology

The paper also proposes a simple approach to non-parametric vignette calibration that requires minimal set-up and is easy to reproduce. Parametric models often fail to capture the complex vignetting patterns of real lenses. This approach instead recovers a dense, per-pixel attenuation map from images of a bright, uniformly colored planar surface observed from many viewpoints.
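As a rough illustration of what a dense, non-parametric attenuation map is, the sketch below uses plain flat-field averaging. This is a deliberately simpler technique swapped in for illustration, not the paper's calibration procedure: it assumes the uniformly bright surface fills the whole image in every frame, and that the frames have already been linearized with the inverse response and divided by their exposure times. All names are hypothetical.

```python
import numpy as np

def flat_field_vignette(linear_frames):
    """Estimate a per-pixel attenuation map by flat-field averaging.

    linear_frames: (N, H, W) array of response-corrected,
                   exposure-normalized images of a uniform surface.
    Returns an (H, W) map scaled so that its brightest pixel is 1.
    """
    mean_frame = np.mean(linear_frames, axis=0)   # average out noise
    return mean_frame / mean_frame.max()          # normalize to max 1

# Toy data: a known attenuation map seen under slightly varying brightness.
true_vignette = np.array([[1.0, 0.8],
                          [0.6, 0.5]])
linear_frames = np.stack([true_vignette * b for b in (100.0, 102.0, 98.0)])

estimate = flat_field_vignette(linear_frames)
print(estimate)
```

Because no parametric form (for example a radial polynomial) is imposed, the map can follow arbitrary attenuation patterns, at the cost of needing enough observations of every pixel.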

Evaluation Framework

Every sequence starts and ends at the same position, with mostly exploring camera motion in between. Tracking accuracy can therefore be evaluated from the drift accumulated between start and end, without ground truth for the full sequence. This sidesteps the usual difficulty of generating ground truth in diverse environments, where GPS and indoor positioning systems may not be feasible.
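One way to picture the drift measurement is to align the tracked trajectory to a reference at the start of the loop and again at the end, then compare the two alignments. The sketch below implements a standard least-squares similarity alignment (the Umeyama method) and reads off the scale change between the two ends. How the reference segments are obtained and how the paper turns the alignments into its reported metrics are not described in this summary, so the setup and all names here are assumptions for illustration.

```python
import numpy as np

def align_sim3(src, dst):
    """Least-squares similarity transform with dst ~ s * R @ src + t.

    src, dst: (N, 3) arrays of corresponding 3D points.
    Returns scale s, rotation R (3x3), translation t (3,).
    """
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst
    cov = dst_c.T @ src_c / len(src)              # cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:  # avoid a reflection
        S[2, 2] = -1.0
    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t

# Toy data: the tracker is correct at the start but has shrunk
# to half scale by the time it returns to the starting position.
reference = np.array([[0.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0],
                      [0.0, 0.0, 1.0]])
tracked_start = reference.copy()
tracked_end = reference / 2.0

s_start, R_start, t_start = align_sim3(tracked_start, reference)
s_end, R_end, t_end = align_sim3(tracked_end, reference)
scale_ratio = s_end / s_start   # > 1 means the track shrank over the loop
print(scale_ratio)
```

Using a similarity rather than a rigid transform matters for monocular methods, since absolute scale is unobservable and can itself drift over the sequence.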

Benchmarking Existing Methods

The paper evaluates two existing monocular systems, ORB-SLAM and DSO, on the proposed dataset. The results are analyzed with respect to image resolution, camera field of view, and camera motion direction. The evaluation shows how performance varies with each of these factors and assesses the robustness and accuracy of both methods across a wide range of scenarios.

Numerical Results and Analysis

The quantitative analysis relies on error metrics such as the alignment error, a translational RMSE, to expose the strengths and weaknesses of the tested methods. The alignment error captures the drift accumulated over a sequence in a single number, and it is a more stable measure than raw translational drift, which can be disproportionately affected by sequence-specific factors.
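A minimal sketch of such an alignment error follows. It assumes the metric is the RMSE between two copies of the same tracked trajectory, one mapped by the similarity transform that aligns it to the start of the loop and one by the transform that aligns it to the end. The summary does not give the formula, so this definition, along with the function names and toy values, is an assumption.

```python
import numpy as np

def apply_sim3(s, R, t, points):
    """Apply x -> s * R @ x + t to every row of an (N, 3) array."""
    return s * points @ R.T + t

def alignment_error(points, sim3_start, sim3_end):
    """RMSE between the trajectory aligned at the start and at the end.

    points:     (N, 3) tracked camera positions.
    sim3_start: (s, R, t) aligning the track to the start segment.
    sim3_end:   (s, R, t) aligning the track to the end segment.
    """
    a = apply_sim3(*sim3_start, points)
    b = apply_sim3(*sim3_end, points)
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))

# Toy data: the two alignments differ by a pure offset of one unit,
# so every position disagrees by exactly that distance.
points = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [2.0, 0.0, 0.0]])
identity = (1.0, np.eye(3), np.zeros(3))
shifted = (1.0, np.eye(3), np.array([0.0, 0.0, 1.0]))

err = alignment_error(points, identity, shifted)
print(err)
```

Because the error is averaged over all positions of the trajectory, it combines the effects of translational, rotational, and scale drift into one value.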

Implications and Future Directions

This research provides a robust framework for the evaluation of monocular VO and SLAM algorithms with implications extending to real-world applications such as autonomous vehicles and augmented reality systems. By factoring photometric calibration into VO/SLAM evaluation, the researchers emphasize the importance of leveraging sensor capabilities to improve algorithm precision.

Future developments in this area could involve extending this benchmark to support stereo and RGB-D modalities and incorporating dynamic scenes with moving objects to evaluate algorithmic robustness further. Moreover, integrating machine learning approaches to dynamically adapt algorithms based on real-time calibration data represents an emergent frontier.

In conclusion, the publication supplies a resource that fills the gaps in photometric calibration and environmental diversity left by earlier monocular VO datasets. By aligning the evaluation framework more closely with practical application needs and sensor capabilities, the benchmark offers a common basis for evaluating and comparing visual odometry and SLAM algorithms.
