Overview of AquaticVision: Benchmarking Visual SLAM in Underwater Environment with Events and Frames
The paper "AquaticVision: Benchmarking Visual SLAM in Underwater Environment with Events and Frames" introduces a novel dataset aimed at advancing the paper of underwater visual simultaneous localization and mapping (SLAM) technologies. Reflecting the increasing significance of underwater robotics applications, the dataset is tailored to address the distinctive challenges posed by aquatic environments, such as unpredictable lighting, variable water clarity, and unstructured textures.
Underwater visual SLAM has been a focal point in marine robotics, offering essential capabilities for navigation and environmental perception in GPS-denied settings. However, existing datasets often lack the reliable ground-truth trajectory data needed to validate SLAM systems. This paper addresses that limitation by providing a comprehensive underwater dataset incorporating event camera data, frames, and IMU measurements, alongside accurate ground-truth trajectories obtained with a motion capture system.
Dataset Composition
The dataset comprises recordings from a DAVIS346 stereo event camera setup, providing high-temporal-resolution event streams synchronized with grayscale frames and IMU measurements. By integrating event cameras, the dataset supports research into overcoming adverse underwater conditions caused by poor lighting and water turbidity. The experimental setup in a controlled pool environment enables the capture of data sequences across varied conditions, categorized as either easy or challenging, to simulate different underwater scenarios.
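To make the multi-sensor layout concrete, the sketch below groups asynchronous events by frame timestamps, a typical first step when pairing an event stream with grayscale images. The array layout and file names are illustrative assumptions; the dataset's actual release format is not specified here.

```python
import numpy as np

# Hypothetical layout: events as an (N, 4) array of [t, x, y, polarity]
# and frame timestamps as an (M,) array, both in seconds. The file names
# are placeholders, not the dataset's actual release format.
events = np.load("seq01_events.npy")      # shape (N, 4), sorted by time
frame_ts = np.load("seq01_frame_ts.npy")  # shape (M,)

def events_between_frames(events, frame_ts):
    """Yield, for each frame, the slice of events up to the next frame."""
    t = events[:, 0]
    # For each frame timestamp, find its insertion index in the sorted
    # event timestamps; consecutive indices bound the per-frame slice.
    idx = np.searchsorted(t, frame_ts)
    for k in range(len(frame_ts) - 1):
        yield frame_ts[k], events[idx[k]:idx[k + 1]]

for ts, chunk in events_between_frames(events, frame_ts):
    print(f"frame at t={ts:.4f}s: {len(chunk)} events")
```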
SLAM System Evaluation
The paper undertakes a comparative analysis of several SLAM systems, using the dataset to assess their performance across different sequences. Frame-based systems such as VINS-Stereo and ORB-SLAM2 are evaluated, underscoring the value of combining visual data with inertial measurements for improving pose estimation accuracy, particularly in difficult environments with dynamic lighting. With IMU assistance, VINS-Stereo demonstrates greater robustness than ORB-SLAM2 in the same aquatic settings. However, both systems still struggle in environments characterized by high dynamic range or turbid water.
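Trajectory accuracy in such comparisons is commonly reported as absolute trajectory error (ATE) after rigidly aligning the estimate to the ground truth. The sketch below implements the standard closed-form SE(3) alignment (Umeyama-style, without scale) followed by RMSE; it is a generic evaluation recipe, not necessarily the paper's exact protocol, and it assumes the two trajectories have already been associated by timestamp.

```python
import numpy as np

def align_and_ate(est, gt):
    """Rigidly align est to gt via closed-form SVD (Umeyama, no scale)
    and return the absolute trajectory error (RMSE) in meters.

    est, gt: (N, 3) arrays of time-associated positions.
    """
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    E, G = est - mu_e, gt - mu_g
    # Cross-covariance between the centered point sets.
    U, _, Vt = np.linalg.svd(E.T @ G)
    # Reflection guard: keep R a proper rotation (det = +1).
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    R = Vt.T @ S @ U.T
    t = mu_g - R @ mu_e
    aligned = est @ R.T + t
    return np.sqrt(np.mean(np.sum((aligned - gt) ** 2, axis=1)))
```

In practice, estimated and ground-truth poses are first matched by nearest timestamp within a small tolerance before this alignment is applied.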
Additionally, the paper evaluates ESVO2, a state-of-the-art event-based SLAM system, which fails to operate effectively on most sequences due to calibration challenges and environmental complications. Even so, the evaluation highlights the potential of event cameras for underwater SLAM, particularly for improving resilience against lighting variations and for detecting environmental features.
Implications and Future Directions
The dataset not only serves as a benchmark for evaluating current underwater SLAM systems but also opens avenues for further innovation. Future work could explore adaptive sensor fusion, combining the strengths of event cameras and traditional frame-based approaches to better address underwater SLAM challenges. Improved calibration methods, optimized stereo camera setups, and more sophisticated event representations all have the potential to improve SLAM accuracy in complex underwater conditions.
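As one example of an event representation, the sketch below accumulates event polarities into a temporally discretized voxel grid with linear interpolation in time, a form often fed to learning-based or geometric pipelines. The bin count is an illustrative assumption; the default resolution matches the DAVIS346 sensor (346 x 260).

```python
import numpy as np

def event_voxel_grid(events, num_bins=5, height=260, width=346):
    """Accumulate events into a (num_bins, H, W) voxel grid, splitting
    each event's polarity between the two nearest temporal bins.

    events: (N, 4) array of [t, x, y, polarity in {-1, +1}].
    """
    grid = np.zeros((num_bins, height, width), dtype=np.float32)
    t = events[:, 0]
    x = events[:, 1].astype(int)
    y = events[:, 2].astype(int)
    p = events[:, 3]
    # Normalize timestamps to the continuous bin axis [0, num_bins - 1].
    tn = (t - t[0]) / max(t[-1] - t[0], 1e-9) * (num_bins - 1)
    lo = np.floor(tn).astype(int)
    hi = np.clip(lo + 1, 0, num_bins - 1)
    w_hi = tn - lo  # fraction of polarity assigned to the later bin
    # Unbuffered in-place accumulation handles repeated pixel indices.
    np.add.at(grid, (lo, y, x), p * (1.0 - w_hi))
    np.add.at(grid, (hi, y, x), p * w_hi)
    return grid
```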
The release of this dataset, combined with rigorous performance evaluations of established SLAM methodologies, establishes a foundational resource for aquatic vision research and propels the technological advancement of intelligent underwater robotics systems. Continued updates and expansions of such benchmark datasets will likely contribute significantly to overcoming the technical challenges facing underwater navigation and exploration.