RigNet++: Semantic Assisted Repetitive Image Guided Network for Depth Completion (2309.00655v4)
Abstract: Depth completion aims to recover dense depth maps from sparse ones, where color images are often used to facilitate this task. Recent depth completion methods primarily focus on image-guided learning frameworks. However, blurry guidance in the image and unclear structure in the depth still impede their performance. To tackle these challenges, we explore a repetitive design in our image-guided network to gradually and sufficiently recover depth values. Specifically, the repetition is embodied in both the image guidance branch and the depth generation branch. In the former branch, we design a dense repetitive hourglass network (DRHN) to extract discriminative image features of complex environments, which can provide powerful contextual instruction for depth prediction. In the latter branch, we present a repetitive guidance (RG) module based on dynamic convolution, in which an efficient convolution factorization is proposed to reduce the complexity while modeling high-frequency structures progressively. Furthermore, in the semantic guidance branch, we utilize the well-known large vision model, i.e., segment anything (SAM), to supply RG with a semantic prior. In addition, we propose a region-aware spatial propagation network (RASPN) for further depth refinement based on the semantic prior constraint. Finally, we collect a new dataset termed TOFDC for the depth completion task, which is acquired by the time-of-flight (TOF) sensor and the color camera on smartphones. Extensive experiments demonstrate that our method achieves state-of-the-art performance on KITTI, NYUv2, Matterport3D, 3D60, VKITTI, and our TOFDC.
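The abstract does not spell out how RASPN applies the semantic prior, so the following is only a rough illustration of the general idea of region-constrained spatial propagation, not the authors' method. It assumes that a per-pixel segment id (standing in for a SAM-style mask) restricts propagation to neighbors in the same region, and it uses uniform averaging where a real spatial propagation network would predict learned affinities. The names `propagate_step` and `refine` are hypothetical, and plain Python lists are used to keep the sketch dependency-free.

```python
def propagate_step(depth, labels, sparse, valid):
    """One region-masked propagation step on a 2-D grid (illustrative only).

    depth  : current dense depth estimate, list of rows of floats
    labels : integer segment id per pixel (assumed stand-in for a semantic prior)
    sparse : measured sparse depth values
    valid  : True where `sparse` holds a real measurement
    """
    h, w = len(depth), len(depth[0])
    out = [row[:] for row in depth]
    for y in range(h):
        for x in range(w):
            if valid[y][x]:
                # Keep sensor measurements fixed during refinement.
                out[y][x] = sparse[y][x]
                continue
            total, count = depth[y][x], 1
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                inside = 0 <= ny < h and 0 <= nx < w
                # Assumption: only neighbors with the same segment id contribute,
                # so depth does not leak across region boundaries.
                if inside and labels[ny][nx] == labels[y][x]:
                    total += depth[ny][nx]
                    count += 1
            # Uniform weights here; a learned network would predict affinities.
            out[y][x] = total / count
    return out


def refine(depth, labels, sparse, valid, steps=10):
    """Apply `propagate_step` repeatedly and return the refined depth map."""
    for _ in range(steps):
        depth = propagate_step(depth, labels, sparse, valid)
    return depth


if __name__ == "__main__":
    # Two regions side by side, each with a single measured depth value.
    labels = [[0, 0, 1, 1], [0, 0, 1, 1]]
    sparse = [[1.0, 0.0, 0.0, 5.0], [0.0, 0.0, 0.0, 0.0]]
    valid = [[True, False, False, True], [False, False, False, False]]
    coarse = [[0.0] * 4 for _ in range(2)]
    for row in refine(coarse, labels, sparse, valid, steps=50):
        print([round(v, 3) for v in row])
```

In this toy setting each region is pulled toward its own measurement and the boundary between the two regions stays sharp, which is the behavior a semantic constraint on propagation is meant to encourage.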