Adaptive Learning for Multi-view Stereo Reconstruction (2404.05181v1)

Published 8 Apr 2024 in cs.CV

Abstract: Deep learning has recently demonstrated its excellent performance on the task of multi-view stereo (MVS). However, loss functions applied for deep MVS are rarely studied. In this paper, we first analyze existing loss functions' properties for deep depth based MVS approaches. Regression based loss leads to inaccurate continuous results by computing mathematical expectation, while classification based loss outputs discretized depth values. To this end, we then propose a novel loss function, named adaptive Wasserstein loss, which is able to narrow down the difference between the true and predicted probability distributions of depth. Besides, a simple but effective offset module is introduced to better achieve sub-pixel prediction accuracy. Extensive experiments on different benchmarks, including DTU, Tanks and Temples and BlendedMVS, show that the proposed method with the adaptive Wasserstein loss and the offset module achieves state-of-the-art performance.

Summary

The paper introduces an adaptive Wasserstein loss for depth distributions and an offset module aimed at sub-pixel accuracy in depth predictions.
It overcomes limitations in regression and classification losses by aligning predicted and true depth distributions even with non-overlapping supports.
Empirical tests on DTU, Tanks and Temples, and BlendedMVS benchmarks demonstrate state-of-the-art performance and scalability.

Adaptive Learning for Multi-view Stereo Reconstruction Using Adaptive Wasserstein Loss and Offset Module

Introduction

Multi-view stereo (MVS) generates dense 3D reconstructions from multiple images. Deep learning (DL)-based approaches have significantly advanced the field, but the design of loss functions, a key component of any DL model, has rarely been studied in deep MVS research. The paper addresses this gap by analyzing existing loss functions and proposing an adaptive Wasserstein loss combined with an offset module. The authors report state-of-the-art performance for this combination on DTU, Tanks and Temples, and BlendedMVS.

Analysis of Existing Loss Functions

Regression-based and Classification-based Loss

Existing deep MVS methods typically employ either regression-based or classification-based loss functions. Regression-based approaches predict a continuous depth value as the mathematical expectation over the depth hypotheses, weighted by their predicted probabilities. When the predicted distribution is multi-modal, this expectation can fall between the modes and yield an inaccurate depth. Classification-based methods, on the other hand, output one of the discretized depth values, which prevents sub-pixel accuracy.
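The two readouts can be contrasted on a toy bimodal distribution. The sketch below is illustrative only: the depth hypotheses, the probabilities, and the function names are assumptions made for this example and do not come from the paper.

```python
import numpy as np

def expected_depth(probs, depths):
    """Regression-style readout: probability-weighted mean of the depth hypotheses."""
    return float(np.sum(probs * depths))

def argmax_depth(probs, depths):
    """Classification-style readout: depth of the most probable hypothesis."""
    return float(depths[np.argmax(probs)])

# Hypothetical depth hypotheses: 1.0, 1.1, ..., 2.0
depths = np.linspace(1.0, 2.0, 11)

# Hypothetical bimodal prediction with modes at 1.2 and 1.8
probs = np.zeros(11)
probs[2] = 0.55
probs[8] = 0.45

reg = expected_depth(probs, depths)  # lands between the two modes
cls = argmax_depth(probs, depths)    # snaps to a hypothesis on the grid

print(f"expectation readout: {reg:.2f}")
print(f"argmax readout:      {cls:.2f}")
```

Here the expectation is about 1.47, a depth that neither mode supports, while the argmax readout can only return values on the 0.1-spaced grid.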

Novel Contributions

Adaptive Wasserstein Loss

The authors introduce an adaptive Wasserstein loss that narrows the difference between the true and predicted depth distributions, even when the two do not share a common support. This addresses a limitation of the Kullback-Leibler divergence used in classification-based approaches, which gives no useful measure of how far apart two distributions are when their supports do not overlap.
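The non-overlapping-support argument can be illustrated with the standard one-dimensional Wasserstein-1 distance, computed as the summed absolute difference of the two cumulative distributions. This is a sketch of the general idea, not the paper's adaptive variant, whose exact form is not given in this summary; the bin count, the clipping constant `eps`, and all function names are assumptions.

```python
import numpy as np

def wasserstein_1d(p, q, bin_width=1.0):
    """Wasserstein-1 distance between two histograms on the same uniform grid."""
    return float(np.sum(np.abs(np.cumsum(p) - np.cumsum(q))) * bin_width)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q), with q clipped to eps so the value stays finite."""
    q = np.clip(q, eps, 1.0)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def one_hot(index, n=8):
    """Distribution with all mass on a single depth bin."""
    v = np.zeros(n)
    v[index] = 1.0
    return v

target = one_hot(2)  # true depth lies in bin 2
near = one_hot(3)    # prediction one bin away
far = one_hot(7)     # prediction five bins away

w_near, w_far = wasserstein_1d(target, near), wasserstein_1d(target, far)
kl_near, kl_far = kl_divergence(target, near), kl_divergence(target, far)

print(f"Wasserstein: near={w_near}, far={w_far}")
print(f"KL:          near={kl_near:.2f}, far={kl_far:.2f}")
```

With disjoint supports, the clipped KL value is identical for the near and far predictions, whereas the Wasserstein distance grows with the number of bins separating them (1 versus 5 here), so it still indicates which prediction is closer to the target.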

Offset Module

Additionally, an offset module is introduced to enhance prediction accuracy to sub-pixel levels. This module operates by predicting both a probability for fixed discrete depth values and an additional offset for each value, thereby resolving issues related to discretized outputs and enabling continuous depth value predictions.
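One plausible readout for such a module is sketched below. It assumes the final depth is the most probable discrete hypothesis plus that hypothesis's predicted offset; the summary does not say how the paper combines the two outputs, so this selection rule, the values, and the name `depth_with_offset` are hypothetical.

```python
import numpy as np

def depth_with_offset(probs, offsets, depths):
    """Pick the most probable depth hypothesis and add its predicted offset."""
    k = int(np.argmax(probs))
    return float(depths[k] + offsets[k])

# Hypothetical per-pixel outputs for four fixed depth hypotheses
depths = np.array([1.0, 1.1, 1.2, 1.3])       # fixed discrete depth values
probs = np.array([0.05, 0.10, 0.70, 0.15])    # predicted probability per value
offsets = np.array([0.00, 0.02, 0.03, -0.01]) # predicted offset per value

refined = depth_with_offset(probs, offsets, depths)
print(f"refined depth: {refined:.2f}")
```

In this toy case the output is 1.2 + 0.03, a value between grid points that an argmax over the fixed hypotheses alone could not produce.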

Empirical Evaluation

Benchmarks and Results

The proposed method was evaluated on the DTU, Tanks and Temples, and BlendedMVS benchmarks. It achieved state-of-the-art results on DTU, with similarly strong performance on Tanks and Temples. In particular, it outperformed $D^2$HC-RMVSNet, which shares a similar architecture, so the gain can be attributed to the proposed loss function and offset module rather than to the network design. Experiments on BlendedMVS further demonstrated the method's scalability and practicality across diverse scenes.

Discussion

Benefits of Adaptive Wasserstein Loss and Offset Module

The adaptive Wasserstein loss addresses the shortcomings of previous loss functions by pulling the predicted depth distribution toward the true distribution. The offset module then allows sub-pixel accuracy in depth predictions, which purely discretized outputs cannot provide.

Future Implications

While the proposed method advances the performance of MVS reconstruction, future work could explore the integration of these principles into other 3D vision tasks. Additionally, further refinement of the offset module could enhance its effectiveness across a broader range of scenarios.

Conclusion

The paper presents a novel adaptive Wasserstein loss function combined with an offset module for deep MVS, significantly improving depth prediction accuracy and completeness. Through extensive experimentation, the method demonstrated superior performance on multiple benchmarks, offering promising directions for future research in 3D vision and MVS reconstruction.
