
Digging Into Normal Incorporated Stereo Matching (2402.18171v1)

Published 28 Feb 2024 in cs.CV

Abstract: Despite the remarkable progress facilitated by learning-based stereo-matching algorithms, disparity estimation in low-texture, occluded, and bordered regions remains a bottleneck that limits performance. To tackle these challenges, geometric guidance such as plane information is necessary, as it provides intuitive guidance about disparity consistency and affinity similarity. In this paper, we propose a normal incorporated joint learning framework consisting of two specific modules named non-local disparity propagation (NDP) and affinity-aware residual learning (ARL). The estimated normal map is first utilized to calculate a non-local affinity matrix and a non-local offset, which are used to perform spatial propagation at the disparity level. To enhance geometric consistency, especially in low-texture regions, the estimated normal map is then leveraged to calculate a local affinity matrix, which tells the residual learning where the correction should refer and thus improves its efficiency. Extensive experiments on several public datasets, including Scene Flow, KITTI 2015, and Middlebury 2014, validate the effectiveness of our proposed method. By the time we finished this work, our approach ranked 1st for stereo matching across foreground pixels on the KITTI 2015 dataset and 3rd on the Scene Flow dataset among all the published works.
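
The propagation idea in the abstract can be illustrated with a small sketch. It is not the authors' NDP module: in the paper both the affinity matrix and the offsets are calculated from the estimated normal map, whereas here the offsets are fixed by hand and the affinity is assumed to be a softmax over the cosine similarity of surface normals. The function name `propagate_disparity` and the `temperature` parameter are hypothetical and chosen only for illustration.

```python
import math

def propagate_disparity(disp, normals, offsets, temperature=0.1):
    """One step of normal-guided spatial propagation (illustrative only).

    disp:    H x W nested list of disparities
    normals: H x W nested list of unit normals (nx, ny, nz)
    offsets: list of (dy, dx) neighbour offsets; ASSUMPTION: fixed here,
             whereas the paper derives them from the normal map
    """
    h, w = len(disp), len(disp[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            n_p = normals[y][x]
            weighted_sum = 0.0
            weight_total = 0.0
            # The pixel itself is always included, so weight_total > 0.
            for dy, dx in [(0, 0)] + list(offsets):
                qy, qx = y + dy, x + dx
                if not (0 <= qy < h and 0 <= qx < w):
                    continue  # skip neighbours outside the image
                n_q = normals[qy][qx]
                # ASSUMPTION: affinity from cosine similarity of normals,
                # so pixels on the same plane get most of the weight.
                cos_sim = sum(a * b for a, b in zip(n_p, n_q))
                weight = math.exp(cos_sim / temperature)
                weighted_sum += weight * disp[qy][qx]
                weight_total += weight
            out[y][x] = weighted_sum / weight_total
    return out

# Toy 1 x 6 scene: two planes with different normals and one noisy pixel.
front, side = (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)
disp = [[10.0, 20.0, 10.0, 50.0, 50.0, 50.0]]
normals = [[front, front, front, side, side, side]]
offsets = [(0, -2), (0, -1), (0, 1), (0, 2)]
refined = propagate_disparity(disp, normals, offsets)
```

In this toy scene the noisy value at index 1 is pulled toward its coplanar neighbours, while the pixel at index 3 stays close to 50 because neighbours across the plane boundary have dissimilar normals and receive almost no weight.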
