- The paper introduces a novel self-supervised autoencoder that leverages photometric consistency to denoise depth maps without requiring ground truth data.
- It utilizes multi-view RGB-D inputs to preserve detail fidelity and outperforms traditional methods on metrics such as MAE and RMSE.
- The approach offers promising applications in AR, robotics, and SLAM, setting a new benchmark for noise reduction in depth sensing.
Self-Supervised Deep Depth Denoising: A Technical Overview
Introduction
Despite recent advances in depth-sensing technology, noise remains a significant challenge even for consumer-grade sensors. The paper "Self-Supervised Deep Depth Denoising" addresses this issue by proposing a fully convolutional deep autoencoder that denoises depth maps. The autoencoder is trained in a self-supervised manner, eliminating the need for clean ground-truth depth data, which is typically difficult to obtain. By leveraging multiple viewpoints and enforcing photometric consistency through differentiable rendering, the approach suppresses noise while maintaining detail fidelity.
The Deep Autoencoder and Self-Supervision Approach
The proposed architecture is a convolutional autoencoder trained end to end for depth denoising. The method exploits multi-view geometry by capturing input data with multiple RGB-D sensors. Each sensor provides a depth map and an aligned color image, allowing the model to use the photometric relationships between views as the supervisory signal for learning noise patterns.
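The multi-view setup relies on lifting each sensor's depth map into camera-space 3D points via the camera intrinsics. As a rough illustration (not the paper's code; the intrinsic values below are made up), a pinhole back-projection can be sketched as:

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Lift a depth map (H, W) into camera-space 3D points (H, W, 3)
    using the pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)

# Toy example: a flat surface at 1 m depth, hypothetical intrinsics.
depth = np.ones((4, 4), dtype=np.float64)
pts = backproject(depth, fx=2.0, fy=2.0, cx=1.5, cy=1.5)
```

With all views lifted into 3D this way, points from one sensor can be transformed into another sensor's frame using the calibrated extrinsics, which is what makes cross-view supervision possible.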
The assumption of photometric consistency ensures that different viewpoints of the same scene can supervise one another, without requiring ground-truth depth maps. Forward splatting accumulates color information across views while remaining differentiable, enabling backpropagation through the rendering step. This design addresses common challenges such as depth occlusions and view-dependent noise, establishing a robust framework for depth denoising.
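To make the supervision signal concrete, here is a minimal NumPy sketch of a photometric consistency check: camera-space points are projected into a second view, colors are sampled there, and an L1 error against the source colors is computed. This uses nearest-neighbour sampling for brevity, whereas the paper uses differentiable forward splatting; all function names and intrinsic values are illustrative assumptions.

```python
import numpy as np

def project(points, fx, fy, cx, cy):
    """Pinhole projection of (N, 3) camera-space points to (N, 2) pixel coords."""
    z = points[:, 2]
    u = fx * points[:, 0] / z + cx
    v = fy * points[:, 1] / z + cy
    return np.stack([u, v], axis=-1)

def photometric_loss(color_src, color_tgt, points_in_tgt, fx, fy, cx, cy):
    """Mean L1 color difference between source pixels and their reprojections
    into the target view. Nearest-neighbour sampling keeps the sketch short;
    the paper accumulates colors with differentiable forward splatting instead."""
    h, w = color_tgt.shape[:2]
    uv = np.round(project(points_in_tgt, fx, fy, cx, cy)).astype(int)
    valid = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    sampled = color_tgt[uv[valid, 1], uv[valid, 0]]
    return np.abs(color_src.reshape(-1, 3)[valid] - sampled).mean()

# Toy check: a single source pixel whose 3D point lands on an
# identically colored target pixel should yield zero loss.
color_tgt = np.full((4, 4, 3), 0.5)
color_src = np.full((1, 1, 3), 0.5)
points = np.array([[0.0, 0.0, 1.0]])
loss = photometric_loss(color_src, color_tgt, points, fx=1.0, fy=1.0, cx=1.5, cy=1.5)
```

A denoised depth map that yields consistent reprojections across views drives this loss down, which is exactly why no ground-truth depth is needed.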
Evaluation and Comparisons
Quantitative and qualitative assessments were performed on data captured by Intel RealSense D415 sensors, with comparisons against classical and state-of-the-art denoising methods such as bilateral filtering and DDRNet. The paper reports that its method outperforms these alternatives across several metrics, including Mean Absolute Error (MAE) and Root Mean Square Error (RMSE). These results underline the approach's effectiveness at reducing noise while preserving important geometric detail. The experimental setup used a structured, calibrated multi-sensor array for data collection.
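The reported metrics are standard. A small sketch of how MAE and RMSE are typically computed over valid depth pixels follows; masking out zero readings as invalid is an assumption common to depth evaluation, not a detail taken from the paper:

```python
import numpy as np

def depth_errors(pred, gt, invalid=0.0):
    """MAE and RMSE over pixels where the reference depth is valid
    (here, not equal to the sentinel `invalid` value)."""
    mask = gt != invalid
    diff = pred[mask] - gt[mask]
    mae = np.abs(diff).mean()
    rmse = np.sqrt((diff ** 2).mean())
    return mae, rmse

# Toy example: prediction off by 0.1 m everywhere; the zero-depth
# pixel in the reference is excluded from the error.
gt = np.array([[1.0, 2.0], [0.0, 3.0]])
pred = np.array([[1.1, 2.1], [5.0, 3.1]])
mae, rmse = depth_errors(pred, gt)
```

Because RMSE squares the residuals, it penalizes large outliers more heavily than MAE, which is why papers commonly report both.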
Implications and Future Directions
This research has significant implications for fields that rely on precise depth information, including augmented reality, robotics, and autonomous systems. Achieving high-quality depth perception without painstakingly acquired ground-truth datasets is a step toward more autonomous machine learning pipelines.
Future work may further optimize the model's computational efficiency and explore its adaptability to other sensor types and environmental conditions. Additionally, integrating the model into broader systems, such as Simultaneous Localization and Mapping (SLAM) frameworks, could extend its practical applications.
Conclusion
This paper represents a meaningful step toward addressing noise in depth sensing through self-supervised deep learning. A deep autoencoder driven by photometric consistency across multiple viewpoints effectively denoises depth maps while retaining geometric detail. These innovations suggest promising avenues for AI applications that depend on accurate depth data, and the open-source release of the paper's resources should propel further research in depth map denoising.