
DDL-MVS: Depth Discontinuity Learning for MVS Networks

Published 2 Mar 2022 in cs.CV and cs.LG | (2203.01391v3)

Abstract: Traditional multi-view stereo (MVS) methods achieve good accuracy but struggle with completeness, while recently developed learning-based MVS techniques have improved completeness at the cost of accuracy. We propose depth discontinuity learning for MVS methods, which further improves accuracy while retaining the completeness of the reconstruction. Our idea is to jointly estimate depth and boundary maps, where the boundary maps are explicitly used to further refine the depth maps. We validate our idea and demonstrate that our strategies can be easily integrated into existing learning-based MVS pipelines in which the reconstruction depends on high-quality depth map estimation. Extensive experiments on various datasets show that our method improves reconstruction quality compared to the baseline. Experiments also demonstrate that the presented model and strategies generalize well. The source code will be available soon.
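
To make the notion of a boundary map concrete, the following is a minimal sketch, not the paper's learned method. It assumes a depth discontinuity can be approximated as a pixel where the discrete 4-neighbour Laplacian of the depth map is large in magnitude. The function names `laplacian` and `boundary_map` and the `threshold` parameter are hypothetical and chosen only for illustration.

```python
from typing import List

Grid = List[List[float]]


def laplacian(depth: Grid) -> Grid:
    """4-neighbour discrete Laplacian with replicated borders."""
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clamp neighbour indices so border pixels reuse their own row/column.
            up = depth[max(y - 1, 0)][x]
            down = depth[min(y + 1, h - 1)][x]
            left = depth[y][max(x - 1, 0)]
            right = depth[y][min(x + 1, w - 1)]
            out[y][x] = up + down + left + right - 4.0 * depth[y][x]
    return out


def boundary_map(depth: Grid, threshold: float) -> List[List[int]]:
    """Mark pixels whose Laplacian magnitude exceeds a (hypothetical) threshold."""
    lap = laplacian(depth)
    return [[1 if abs(v) > threshold else 0 for v in row] for row in lap]


# Toy depth map: a near surface (depth 1) next to a far surface (depth 5).
depth = [
    [1.0, 1.0, 5.0, 5.0],
    [1.0, 1.0, 5.0, 5.0],
    [1.0, 1.0, 5.0, 5.0],
]
edges = boundary_map(depth, threshold=1.0)
for row in edges:
    print(row)
```

In this toy case the two columns on either side of the depth jump are flagged. A hand-set threshold like this one is brittle on real data, which is one reason a learned boundary estimate, as the abstract proposes, may be preferable.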

