
Tri-Perspective View Decomposition for Geometry-Aware Depth Completion (2403.15008v1)

Published 22 Mar 2024 in cs.CV

Abstract: Depth completion is a vital task for autonomous driving, as it involves reconstructing the precise 3D geometry of a scene from sparse and noisy depth measurements. However, most existing methods either rely only on 2D depth representations or directly incorporate raw 3D point clouds for compensation, which are still insufficient to capture the fine-grained 3D geometry of the scene. To address this challenge, we introduce Tri-Perspective view Decomposition (TPVD), a novel framework that can explicitly model 3D geometry. In particular, (1) TPVD ingeniously decomposes the original point cloud into three 2D views, one of which corresponds to the sparse depth input. (2) We design TPV Fusion to update the 2D TPV features through recurrent 2D-3D-2D aggregation, where a Distance-Aware Spherical Convolution (DASC) is applied. (3) By adaptively choosing TPV affinitive neighbors, the newly proposed Geometric Spatial Propagation Network (GSPN) further improves the geometric consistency. As a result, our TPVD outperforms existing methods on KITTI, NYUv2, and SUN RGBD. Furthermore, we build a novel depth completion dataset named TOFDC, which is acquired by the time-of-flight (TOF) sensor and the color camera on smartphones. Project page: https://yanzq95.github.io/projectpage/TOFDC/index.html
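The core idea of step (1), projecting a 3D point cloud onto three orthogonal 2D planes, can be illustrated with a minimal sketch. Note this is an illustrative assumption of how such a decomposition might look, not the paper's implementation: the function name `tpv_decompose`, the grid resolution, and the axis conventions are all hypothetical, and the actual TPVD aligns one view with the camera's sparse depth map and fuses the views with learned modules (TPV Fusion, DASC, GSPN) rather than a fixed scatter.

```python
import numpy as np

def tpv_decompose(points, grid=(64, 64, 64)):
    """Scatter an (N, 3) point cloud onto three orthogonal 2D views.

    Each view drops one axis and keeps, per cell, the nearest point
    along the dropped axis (min-depth, as a sensor projection would).
    Hypothetical illustration only; not the paper's implementation.
    """
    # Normalize the cloud into the unit cube [0, 1].
    mins = points.min(axis=0)
    spans = np.maximum(points.max(axis=0) - mins, 1e-9)
    u = (points - mins) / spans
    # Quantize normalized coordinates to integer grid indices.
    idx = np.minimum((u * np.array(grid)).astype(int), np.array(grid) - 1)

    H, W, D = grid
    views = {
        "xy": np.full((H, W), np.inf),  # drop z: depth measured along z
        "xz": np.full((H, D), np.inf),  # drop y: depth measured along y
        "yz": np.full((W, D), np.inf),  # drop x: depth measured along x
    }
    for (ix, iy, iz), (x, y, z) in zip(idx, u):
        views["xy"][ix, iy] = min(views["xy"][ix, iy], z)
        views["xz"][ix, iz] = min(views["xz"][ix, iz], y)
        views["yz"][iy, iz] = min(views["yz"][iy, iz], x)
    return views
```

Cells that received no point stay at `inf`, which mirrors the sparsity the paper starts from: one of the three views then plays the role of the sparse depth input that the network densifies.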
