
Grid-based Fast and Structural Visual Odometry (2403.01110v1)

Published 2 Mar 2024 in cs.RO

Abstract: In the field of Simultaneous Localization and Mapping (SLAM), researchers have always pursued better performance in terms of accuracy and time cost. Traditional algorithms typically rely on fundamental geometric elements in images to establish connections between frames. However, these elements suffer from disadvantages such as uneven distribution and slow extraction. In addition, geometric elements such as lines have not been fully utilized in the process of pose estimation. To address these challenges, we propose GFS-VO, a grid-based RGB-D visual odometry algorithm that maximizes the utilization of both point and line features. Our algorithm incorporates fast line extraction and a stable line homogenization scheme to improve feature processing. To fully leverage hidden elements in the scene, we introduce Manhattan Axes (MA) to provide constraints between the local map and the current frame. Additionally, we design an algorithm based on breadth-first search for extracting plane normal vectors. To evaluate the performance of GFS-VO, we conducted extensive experiments. The results demonstrate that the proposed algorithm achieves significant improvements in both time cost and accuracy compared to existing approaches.


Summary

  • The paper introduces GFS-VO, a grid-based visual odometry system that leverages both point and line features for enhanced SLAM performance.
  • It employs innovative line homogenization techniques and Manhattan Axes extraction to reduce computation time and improve pose estimation accuracy.
  • Extensive experiments show that GFS-VO improves on existing approaches in both accuracy and time cost, including in challenging real-world scenes.

Enhancing Visual Odometry with GFS-VO: A Grid-based Approach Leveraging Line Features

Overview of GFS-VO

GFS-VO (Grid-based Fast and Structural Visual Odometry) is a novel RGB-D visual odometry system for Simultaneous Localization and Mapping (SLAM) that maximizes the use of both point and line features. Traditional algorithms often rely on point features for frame-to-frame connections, but points are susceptible to environmental factors such as lighting variations and occlusions. GFS-VO addresses these challenges with fast line extraction, a stable line homogenization scheme, and the incorporation of Manhattan Axes (MA), improving pose estimation accuracy while reducing time cost.

Feature Extraction and Homogenization

GFS-VO applies separate extraction procedures to geometric and spatial features. For geometric features, the ORB and EDLines algorithms are used for point and line detection, respectively. Line homogenization is then achieved through one of three strategies: Quadtree-based, Midpoint-Quadtree-based, or a novel Score-based scheme. These methods ensure an even spatial distribution of line features, which is crucial for reducing the computational burden and improving system performance.
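
To make the idea concrete, the sketch below shows one plausible form of score-based line homogenization: detected segments are binned into image grid cells by their midpoints, and only the top-scoring segments in each cell are kept so that features stay evenly spread. The function name, grid size, per-cell quota, and the length-times-response score are illustrative assumptions, not the paper's exact formulation.

    from collections import defaultdict

    def homogenize_lines(lines, img_w, img_h, grid=8, per_cell=2):
        # lines: iterable of (x1, y1, x2, y2, response) tuples
        cell_w, cell_h = img_w / grid, img_h / grid
        cells = defaultdict(list)
        for x1, y1, x2, y2, resp in lines:
            mx, my = (x1 + x2) / 2.0, (y1 + y2) / 2.0   # bin by midpoint
            key = (int(mx // cell_w), int(my // cell_h))
            length = ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
            cells[key].append((length * resp, (x1, y1, x2, y2, resp)))
        kept = []
        for scored in cells.values():
            scored.sort(key=lambda s: s[0], reverse=True)  # strongest first
            kept.extend(line for _, line in scored[:per_cell])
        return kept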

Plane features are incorporated primarily for MA extraction. A Breadth-First Search (BFS) based algorithm extracts plane normal vectors, which are pivotal for determining the Manhattan Axes accurately. The method allows rapid feature extraction without sacrificing accuracy, addressing a significant bottleneck in visual odometry frameworks that rely on structural assumptions.
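
The following is a minimal sketch of BFS-based region growing over a per-pixel normal map, the general technique this step builds on: neighboring pixels whose normals agree within a threshold are grouped, and each sufficiently large region contributes one candidate plane normal. All names, thresholds, and the coarse seeding grid are assumptions for illustration, not the authors' implementation; invalid normals from missing depth are assumed to be filtered out beforehand.

    from collections import deque
    import numpy as np

    def plane_normals(normal_map, angle_thresh_deg=10.0, min_region=500):
        # normal_map: H x W x 3 array of unit normals computed from depth
        h, w, _ = normal_map.shape
        cos_t = np.cos(np.radians(angle_thresh_deg))
        visited = np.zeros((h, w), dtype=bool)
        planes = []
        for sy in range(0, h, 4):                  # coarse seed grid
            for sx in range(0, w, 4):
                if visited[sy, sx]:
                    continue
                seed = normal_map[sy, sx]
                visited[sy, sx] = True
                queue, region = deque([(sy, sx)]), []
                while queue:                       # breadth-first growth
                    y, x = queue.popleft()
                    region.append(normal_map[y, x])
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < h and 0 <= nx < w and not visited[ny, nx]
                                and normal_map[ny, nx] @ seed >= cos_t):
                            visited[ny, nx] = True
                            queue.append((ny, nx))
                if len(region) >= min_region:      # keep large planar regions
                    n = np.asarray(region).mean(axis=0)
                    planes.append(n / np.linalg.norm(n))
        return planes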

Pose Estimation and Optimization

GFS-VO extends its grid-based design to pose estimation. Grid-based tracking, built on the grid structure and line homogenization described above, enables efficient feature matching: a feature's geometric position narrows the set of candidate matches that must be examined. This reduces computation time and improves matching accuracy, particularly when the estimated camera speed varies between frames.
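
A rough sketch of how grid-indexed candidate narrowing works in general (the helper names, the 3x3 search window, and the integer-packed descriptors are assumptions, not the paper's implementation): features from the previous frame are indexed by grid cell, and each current feature is compared only against features in the cell predicted by a motion model and its immediate neighbors, instead of against every feature in the frame.

    def hamming(a, b):
        # descriptors packed as Python ints for simplicity
        return bin(a ^ b).count("1")

    def match_in_grid(curr_feats, prev_grid_index, predict_cell, max_dist=64):
        # curr_feats: list of (pt, descriptor)
        # prev_grid_index: dict mapping (cx, cy) -> list of (pt, descriptor)
        # predict_cell: motion-model guess of a point's cell in the last frame
        matches = []
        for pt, desc in curr_feats:
            cx, cy = predict_cell(pt)
            best, best_d = None, max_dist
            for dx in (-1, 0, 1):                  # search the 3x3 neighborhood
                for dy in (-1, 0, 1):
                    for prev_pt, prev_desc in prev_grid_index.get((cx + dx, cy + dy), []):
                        d = hamming(desc, prev_desc)   # ORB uses Hamming distance
                        if d < best_d:
                            best, best_d = prev_pt, d
            if best is not None:
                matches.append((pt, best))
        return matches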

Pose optimization incorporates line and point features together with the structural constraints provided by the MAs and by the relationships between line segments. These terms are integrated into a single optimization that exploits the strengths of each feature type and structural assumption.
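
As a hedged illustration of the kinds of residuals such a joint optimization combines (simplified; the paper's exact error terms, parameterization, and weights may differ), the sketch below shows a point reprojection error and a line error measured as the distance of projected 3D endpoints to the observed 2D line. A structural term penalizing misalignment between the estimated rotation and the Manhattan Axes would be added on top of these.

    import numpy as np

    def project(K, R, t, X):
        # pinhole projection of a 3D point X into pixel coordinates
        x = K @ (R @ X + t)
        return x[:2] / x[2]

    def point_residual(K, R, t, X, obs_uv):
        # standard point reprojection error against the observed pixel
        return project(K, R, t, X) - obs_uv

    def line_residual(K, R, t, P1, P2, line_abc):
        # line_abc: coefficients (a, b, c) of the observed 2D line,
        # normalized so a^2 + b^2 = 1; the residual is the signed
        # distance of each projected 3D endpoint to that line
        a, b, c = line_abc
        res = []
        for P in (P1, P2):
            u, v = project(K, R, t, P)
            res.append(a * u + b * v + c)
        return np.array(res)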

Experimental Validation

Extensive experiments validate GFS-VO's performance, demonstrating clear improvements in both accuracy and time efficiency over existing methods. The line homogenization strategies prove especially effective where features cluster densely within a scene. Across several SLAM benchmark datasets, GFS-VO consistently performs well, particularly in real-world sequences whose complex environments highlight the benefits of the proposed method.

Future Directions and Conclusion

Despite these advances, line homogenization and the interaction between different feature types leave room for refinement. Setting homogenization thresholds adaptively and studying how the positional relationships between features affect odometry accuracy are identified as directions for future work.

GFS-VO is a significant contribution to SLAM, offering a framework that harnesses the strengths of both point and line features through a grid-based approach. Its methods for feature extraction, homogenization, and pose optimization, combined with the strategic use of Manhattan Axes, make it a robust and efficient solution to common visual odometry challenges.