Transferability of Low-Resolution Benchmark Conclusions to High-Resolution Settings in Monocular Depth Estimation

Determine whether conclusions drawn from low-resolution monocular depth estimation benchmarks (images of roughly 500×500 pixels) transfer safely to high-resolution benchmarks (e.g., 1000×2000). In particular, determine whether model comparisons and insights remain valid when image resolution is substantially increased.

Background

The Depth Anything V2 paper argues that widely used monocular depth estimation test benchmarks often have noisy labels and limited scene diversity. A further limitation the authors identify is that these benchmarks consist predominantly of low-resolution images, typically around 500×500 pixels.

Given modern cameras and application needs, higher-resolution inputs (e.g., 1000×2000) are more relevant in practice. The authors explicitly note that it is uncertain whether conclusions from low-resolution evaluations can be extrapolated to higher-resolution scenarios, which motivates the construction of their DA-2K benchmark to better reflect real-world, high-resolution use cases.
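One way to make "transfer" concrete is to ask whether the ranking of models on a low-resolution benchmark agrees with their ranking on a high-resolution one. The sketch below is only an illustration of that idea, not a procedure from the paper: it computes Kendall's tau (the tau-a variant) between two sets of scores. The model names and score values are hypothetical placeholders, and both score sets are assumed to use the same direction (higher is better).

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two {model_name: score} dicts.

    Only models present in both dicts are compared. Tied pairs count as
    neither concordant nor discordant.
    """
    models = sorted(set(scores_a) & set(scores_b))
    if len(models) < 2:
        raise ValueError("need at least two shared models")

    concordant = 0
    discordant = 0
    for m1, m2 in combinations(models, 2):
        diff_a = scores_a[m1] - scores_a[m2]
        diff_b = scores_b[m1] - scores_b[m2]
        product = diff_a * diff_b
        if product > 0:
            concordant += 1  # both benchmarks order the pair the same way
        elif product < 0:
            discordant += 1  # the benchmarks disagree on this pair

    n_pairs = len(models) * (len(models) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical accuracies (higher is better); not measured results.
low_res = {"model_a": 0.90, "model_b": 0.85, "model_c": 0.80}
high_res = {"model_a": 0.88, "model_b": 0.79, "model_c": 0.83}

tau = kendall_tau(low_res, high_res)
print(f"Kendall tau between low- and high-resolution rankings: {tau:.3f}")
```

A tau close to 1 would suggest that model comparisons made at low resolution carry over; a tau near 0 or below would suggest they do not. Rank agreement covers only one aspect of the question, since other conclusions (for example, about architectures or training data) may fail to transfer even when rankings agree.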

References

It remains unclear whether the conclusions drawn from these low-resolution benchmarks can be safely transferred to high-resolution benchmarks.

Depth Anything V2 (arXiv:2406.09414, Yang et al., 2024), Section: A New Evaluation Benchmark: DA-2K, Subsection: Limitations in Existing Benchmarks