- The paper introduces a novel method that integrates triangulated sparse depth cues to resolve scale ambiguity in underwater monocular images.
- It leverages a MobileNetV2 encoder-decoder combined with a vision transformer to predict adaptive bin widths for enhanced range accuracy.
- Quantitative evaluations on the FLSea dataset yield RMSE values of 0.167 m for ranges under five meters and 0.040 m for ranges under one meter, confirming its accuracy.
Metrically Scaled Monocular Depth Estimation through Sparse Priors for Underwater Robots
The paper "Metrically Scaled Monocular Depth Estimation through Sparse Priors for Underwater Robots" presents a refined approach to solving the complex issue of dense depth estimation for autonomous underwater vehicles (AUVs). In environments where traditional active-light sensors such as LiDAR or RGB-D cameras face challenges owing to the unique optical properties of water, the presented method offers a viable alternative, leveraging monocular imagery with a significant incorporation of sparse priors.
Methodology and Contributions
This research builds on state-of-the-art methods in monocular depth estimation by integrating sparse depth cues derived from visual feature triangulation. Such cues help overcome the inherent scale ambiguity of monocular systems. The authors’ primary contributions are articulated in three areas:
- Sparse Depth Priors Integration: Triangulated feature points are converted into a dense parameterization that supplies the monocular depth model with robust metric scale constraints. This parameterization renders the model agnostic to the sparsity level of the input priors (a minimal sketch of this step follows the list).
- Model Architecture: The architecture combines a MobileNetV2-based encoder-decoder with a vision transformer, enriching feature representation while remaining computationally efficient. Notably, the transformer predicts adaptive depth-bin widths for improved range estimation accuracy (the second sketch after the list illustrates this bin decoding).
- Evaluation and Generalization: The model demonstrates improved prediction accuracy on the FLSea dataset, particularly in optically challenging scenes, and generalizes to the Lizard Island coral reef dataset without additional training, showcasing its applicability to diverse underwater tasks.
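To make the prior integration concrete, here is a minimal sketch of turning triangulated feature points into dense input channels. The two-channel form (nearest prior depth plus normalized distance to that prior) and all function names are illustrative assumptions; the paper's exact parameterization may differ.

```python
import numpy as np
from scipy.spatial import cKDTree

def densify_sparse_priors(points_uv, depths, height, width):
    """Convert triangulated sparse depths into two dense prior maps:
    nearest-prior depth and normalized distance to that prior.
    (Hypothetical helper; the paper's exact channels may differ.)"""
    tree = cKDTree(points_uv)
    # Build (u, v) coordinates for every pixel in the image.
    vv, uu = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    pixels = np.stack([uu.ravel(), vv.ravel()], axis=1)
    # Look up the nearest triangulated feature for each pixel.
    dist, idx = tree.query(pixels)
    depth_map = depths[idx].reshape(height, width)
    # Normalize by the image diagonal so the channel stays in [0, 1].
    dist_map = (dist / np.hypot(height, width)).reshape(height, width)
    return np.stack([depth_map, dist_map]).astype(np.float32)

# Usage: prior = densify_sparse_priors(uv, z, 240, 320)  # -> (2, 240, 320)
```

Because every pixel always receives a value from its nearest prior, the output has the same shape regardless of how many points were triangulated, which is what makes the model agnostic to prior sparsity.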
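The adaptive-bin prediction can likewise be sketched generically. The decoding below follows the common AdaBins-style scheme (softmax-normalized bin widths, probability-weighted bin centers); treat it as an assumed reading of the architecture rather than the authors' implementation, with placeholder depth-range bounds.

```python
import torch

def depth_from_adaptive_bins(bin_width_logits, bin_probs,
                             d_min=0.1, d_max=10.0):
    """Decode a dense depth map from predicted adaptive bins
    (AdaBins-style sketch; not the authors' exact implementation).

    bin_width_logits : (B, K) one raw score per depth bin
    bin_probs        : (B, K, H, W) softmaxed per-pixel bin scores
    """
    # Normalize widths so the K bins exactly partition [d_min, d_max].
    widths = torch.softmax(bin_width_logits, dim=1) * (d_max - d_min)
    edges = d_min + torch.cumsum(widths, dim=1)   # right edge of each bin
    centers = edges - 0.5 * widths                # (B, K) bin centers
    # Per-pixel depth = probability-weighted sum of bin centers.
    return torch.einsum("bk,bkhw->bhw", centers, bin_probs)
```

Letting the transformer set the bin widths per image is what allows the model to concentrate depth resolution where the scene actually needs it.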
Quantitative Results
The integration of sparse depth priors significantly enhances prediction accuracy across multiple metrics, with noteworthy RMSE improvements on both linear and logarithmic scales. Specifically, the method achieves RMSE values of 0.167 meters for ranges under five meters and 0.040 meters for ranges under one meter on the FLSea dataset. These results underscore the method's efficacy in precisely reconstructing underwater scenes for tasks that require close-range interaction, such as ecological surveys or manipulation. A sketch of the range-limited evaluation protocol follows.
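For reference, range-limited RMSE can be computed as below. This is a generic sketch: the validity masking and the use of log10 for the logarithmic variant are assumptions, not details confirmed by the paper.

```python
import numpy as np

def range_limited_rmse(pred, gt, max_range):
    """RMSE and log10-RMSE over pixels with valid ground truth below
    max_range, mirroring the under-5 m / under-1 m protocol."""
    mask = (gt > 0) & (gt < max_range)
    pred = np.clip(pred[mask], 1e-3, None)  # guard the logarithm
    err = pred - gt[mask]
    rmse = np.sqrt(np.mean(err ** 2))
    rmse_log = np.sqrt(np.mean((np.log10(pred) - np.log10(gt[mask])) ** 2))
    return rmse, rmse_log

# e.g. rmse_5m, _ = range_limited_rmse(prediction, ground_truth, 5.0)
```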
Practical and Theoretical Implications
The paper opens avenues for deploying cost-effective, computationally efficient depth sensing on underwater platforms. Practically, the method's reliance on monocular cameras complemented by sparse priors makes it an appealing option for lightweight, low-cost systems, aligning well with the operational constraints typical of underwater environments. Theoretically, the work addresses the scale ambiguity of monocular depth estimation by leveraging the data redundancy inherent in video sequences, a step forward in model robustness across domains with varying visual characteristics.
Future Prospects
Future research could improve prior generation accuracy, for example through more sophisticated estimation of depth uncertainty. There is also potential for multi-task learning that jointly addresses complementary tasks such as semantic segmentation, which may improve feature extraction and provide further context for depth estimation.
This paper effectively bridges a gap in underwater robotics by delivering a scalable and reliable depth estimation technique tailored for adverse underwater conditions. The capabilities demonstrated here highlight the transformative potential of blending sparse visual cues with learning-based frameworks for robust autonomous operations in previously challenging domains.