
RomniStereo: Recurrent Omnidirectional Stereo Matching (2401.04345v2)

Published 9 Jan 2024 in cs.CV

Abstract: Omnidirectional stereo matching (OSM) is an essential and reliable means of $360^{\circ}$ depth sensing. However, following earlier works on conventional stereo matching, prior state-of-the-art (SOTA) methods rely on a 3D encoder-decoder block to regularize the cost volume, which complicates the whole system and yields sub-optimal results. Recently, the Recurrent All-pairs Field Transforms (RAFT) based approach, which performs the recurrent update in 2D, has efficiently improved image-matching tasks, i.e., optical flow and stereo matching. To bridge the gap between OSM and RAFT, we mainly propose an opposite adaptive weighting scheme that seamlessly transforms the outputs of the spherical sweeping of OSM into the inputs required by the recurrent update, thus creating a recurrent omnidirectional stereo matching (RomniStereo) algorithm. Furthermore, we introduce two techniques, i.e., grid embedding and adaptive context feature generation, which also contribute to RomniStereo's performance. Our best model improves the average MAE metric by 40.7% over the previous SOTA baseline across five datasets. When visualizing the results, our models demonstrate clear advantages on both synthetic and realistic examples. The code is available at https://github.com/HalleyJiang/RomniStereo.
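
The abstract's key idea is the opposite adaptive weighting scheme: the two feature volumes produced by spherical sweeping are blended with complementary, learned weights to form the reference/target feature pair that a RAFT-style recurrent updater expects. The PyTorch sketch below illustrates that idea only; it is not the authors' implementation. The tensor shapes, the small convolutional head that predicts the blending weights, and the class and variable names are all assumptions made for illustration; consult the official repository for the actual design.

```python
import torch
import torch.nn as nn


class OppositeAdaptiveWeighting(nn.Module):
    """Minimal sketch of an opposite adaptive weighting scheme.

    Given two feature volumes from spherical sweeping, predict a
    per-location weight w in (0, 1) and blend the volumes with
    opposite (complementary) weights to produce the reference and
    target features for a RAFT-style recurrent matcher. Layer sizes
    and shapes here are illustrative assumptions, not the paper's
    exact configuration.
    """

    def __init__(self, channels: int):
        super().__init__()
        # Predict one blending weight per spatial location from the
        # concatenated volumes.
        self.weight_head = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, vol_a: torch.Tensor, vol_b: torch.Tensor):
        # vol_a, vol_b: (B, C, H, W) feature volumes from sweeping.
        w = self.weight_head(torch.cat([vol_a, vol_b], dim=1))
        # "Opposite" weighting: reference and target blend the two
        # volumes with complementary weights.
        ref = w * vol_a + (1.0 - w) * vol_b
        tgt = (1.0 - w) * vol_a + w * vol_b
        return ref, tgt


if __name__ == "__main__":
    # Toy usage on an equirectangular-resolution feature grid.
    oaw = OppositeAdaptiveWeighting(channels=32)
    a = torch.randn(1, 32, 160, 320)
    b = torch.randn(1, 32, 160, 320)
    ref, tgt = oaw(a, b)
    print(ref.shape, tgt.shape)  # both (1, 32, 160, 320)
```

From such a `ref`/`tgt` pair, a RAFT-style pipeline would build an all-pairs correlation volume and iteratively refine the depth (inverse-depth) estimate with a GRU-based update operator, as in RAFT-Stereo.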
