MoCha-Stereo: Motif Channel Attention Network for Stereo Matching

Published 10 Apr 2024 in cs.CV (arXiv:2404.06842v5)

Abstract: Learning-based stereo matching techniques have made significant progress. However, existing methods inevitably lose geometric structure information during the feature channel generation process, resulting in edge detail mismatches. In this paper, the Motif Channel Attention Stereo Matching Network (MoCha-Stereo) is designed to address this problem. We provide the Motif Channel Correlation Volume (MCCV) to determine more accurate edge matching costs. MCCV is achieved by projecting motif channels, which capture common geometric structures in feature channels, onto feature maps and cost volumes. In addition, because edge variations in potential feature channels of the reconstruction error map also affect detail matching, we propose the Reconstruction Error Motif Penalty (REMP) module to further refine the full-resolution disparity estimation. REMP integrates the frequency information of typical channel features from the reconstruction error. MoCha-Stereo ranks 1st on the KITTI-2015 and KITTI-2012 Reflective leaderboards. Our structure also shows excellent performance in Multi-View Stereo. Code is available at https://github.com/ZYangChen/MoCha-Stereo.
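The paper's exact MCCV construction is defined in the linked repository; as a loose, hypothetical illustration of the idea described in the abstract (extract a recurring "motif" pattern from the feature channels, project it back onto the features, then build a disparity-wise correlation volume), a minimal NumPy sketch might look like the following. All function names, the top-k channel selection heuristic, and the additive projection are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def motif_channel(feats, k=4):
    """Crude stand-in for motif extraction: average the k channel
    patterns most correlated with the mean channel pattern.
    feats: (C, H, W) feature map; returns an (H, W) motif projection."""
    C, H, W = feats.shape
    flat = feats.reshape(C, -1)
    mean = flat.mean(axis=0)
    corr = flat @ mean / (np.linalg.norm(flat, axis=1) * np.linalg.norm(mean) + 1e-8)
    top = np.argsort(corr)[-k:]          # indices of the k most "typical" channels
    return feats[top].mean(axis=0)

def correlation_volume(left, right, max_disp):
    """Disparity-wise correlation between motif-augmented features.
    left, right: (C, H, W); returns a (max_disp, H, W) cost volume."""
    C, H, W = left.shape
    m_l = left + motif_channel(left)     # project motif onto the feature map
    m_r = right + motif_channel(right)
    vol = np.zeros((max_disp, H, W), dtype=np.float32)
    for d in range(max_disp):
        if d == 0:
            vol[d] = (m_l * m_r).mean(axis=0)
        else:
            # shift the right view by d pixels before correlating
            vol[d, :, d:] = (m_l[:, :, d:] * m_r[:, :, :-d]).mean(axis=0)
    return vol
```

In this sketch the motif acts as a shared structural prior added to every channel before matching; the actual network learns motif channels and applies them to both feature maps and cost volumes, and additionally refines the result with the REMP module.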
