BAMF-SLAM: Bundle Adjusted Multi-Fisheye Visual-Inertial SLAM Using Recurrent Field Transforms (2306.01173v2)

Published 1 Jun 2023 in cs.RO

Abstract: In this paper, we present BAMF-SLAM, a novel multi-fisheye visual-inertial SLAM system that utilizes Bundle Adjustment (BA) and recurrent field transforms (RFT) to achieve accurate and robust state estimation in challenging scenarios. First, our system directly operates on raw fisheye images, enabling us to fully exploit the wide Field-of-View (FoV) of fisheye cameras. Second, to overcome the low-texture challenge, we explore the tightly-coupled integration of multi-camera inputs and complementary inertial measurements via a unified factor graph and jointly optimize the poses and dense depth maps. Third, for global consistency, the wide FoV of the fisheye camera allows the system to find more potential loop closures, and powered by the broad convergence basin of RFT, our system can perform very wide-baseline loop closing with little overlap. Furthermore, we introduce a semi-pose-graph BA method to avoid the expensive full global BA. By combining relative pose factors with loop closure factors, the global states can be adjusted efficiently with a modest memory footprint while maintaining high accuracy. Evaluations on TUM-VI, Hilti-Oxford and Newer College datasets show the superior performance of the proposed system over prior works. In the Hilti SLAM Challenge 2022, our VIO version achieves second place. In a subsequent submission, our complete system, including the global BA backend, outperforms the winning approach.
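
The semi-pose-graph BA idea described above replaces a full global bundle adjustment with a graph over keyframe poses only, combining relative pose factors (summarizing local BA results) with loop closure factors and solving the joint nonlinear least-squares problem. As a rough illustration of that general pattern, here is a minimal pose-graph optimization sketch in Python using GTSAM; it is not the authors' implementation, and all poses, noise values, and keys below are made up for the example.

```python
# Illustrative sketch (not the paper's code): a tiny 2D pose graph mixing
# odometry-style relative pose factors with one loop closure factor, then
# jointly optimizing all poses. Requires: pip install gtsam numpy
import numpy as np
import gtsam

graph = gtsam.NonlinearFactorGraph()

# Anchor the first pose to remove the gauge freedom of the graph.
prior_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.1, 0.1, 0.05]))
graph.add(gtsam.PriorFactorPose2(0, gtsam.Pose2(0.0, 0.0, 0.0), prior_noise))

# Relative pose factors between consecutive keyframes (stand-ins for the
# factors a backend would distill from its local BA windows): each step
# moves 1 m forward and turns 90 degrees, tracing a square.
odom_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.2, 0.2, 0.1]))
for i in range(4):
    graph.add(gtsam.BetweenFactorPose2(
        i, i + 1, gtsam.Pose2(1.0, 0.0, np.pi / 2), odom_noise))

# A loop closure factor: pose 4 should coincide with pose 0 again.
loop_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.15, 0.15, 0.08]))
graph.add(gtsam.BetweenFactorPose2(
    4, 0, gtsam.Pose2(0.0, 0.0, 0.0), loop_noise))

# Deliberately drifted initial estimates, as accumulated odometry would give.
initial = gtsam.Values()
drifted = [(0.0, 0.0, 0.0), (1.1, 0.1, 1.6), (1.0, 1.2, 3.2),
           (-0.1, 1.1, -1.5), (0.1, -0.1, 0.1)]
for i, (x, y, th) in enumerate(drifted):
    initial.insert(i, gtsam.Pose2(x, y, th))

result = gtsam.LevenbergMarquardtOptimizer(graph, initial).optimize()
for i in range(5):
    print(i, result.atPose2(i))
```

Solving this small graph pulls the drifted estimates back onto the square; the same mechanism, at keyframe scale, is what lets a pose-graph-style backend correct global drift far more cheaply than re-optimizing every landmark in a full BA.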
