L-DYNO: Framework to learn consistent visual features using robot's motion (2310.06249v1)

Published 10 Oct 2023 in cs.RO

Abstract: Historically, feature-based approaches have been used extensively for camera-based robot perception tasks such as localization, mapping, and tracking. Several of these approaches also combine other sensors (inertial sensing, for example) to perform combined state estimation. Our work rethinks this approach; we present a representation learning mechanism that identifies visual features that best correspond to robot motion as estimated by an external signal. Specifically, we utilize the robot's transformations from an external signal (inertial sensing, for example) and attend to the image regions that are most consistent with that signal. We use a pairwise consistency metric as a representation to keep the visual features consistent through a sequence with the robot's relative pose transformations. This approach enables us to incorporate information from the robot's perspective instead of relying solely on image attributes. We evaluate our approach on real-world datasets such as KITTI and EuRoC and compare the refined features with existing feature descriptors. We also evaluate our method in a real-robot experiment. We observe an average 49% reduction in the image search space without compromising trajectory estimation accuracy. Our method reduces the execution time of visual odometry by 4.3% and also reduces reprojection errors. We demonstrate the need to select only the most important features and show the competitiveness of our approach against various feature-detection baselines.
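The abstract sketches the core idea (score candidate features by how well they agree with the robot's externally estimated relative pose) but gives no implementation detail. Below is a minimal illustrative sketch of one way such a pairwise geometric-consistency score could be computed, assuming the external signal supplies a relative rotation and translation and that consistency is measured with a symmetric epipolar distance; the function names, the scoring choice, and the percentile cutoff are assumptions for illustration, not the paper's actual formulation.

```python
import numpy as np


def skew(t):
    """Skew-symmetric matrix [t]_x of a 3-vector."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])


def pairwise_consistency(kpts_a, kpts_b, R_ab, t_ab, K):
    """Score matched keypoints by agreement with an externally supplied
    relative pose (R_ab, t_ab), e.g. from inertial odometry.

    kpts_a, kpts_b : (N, 2) matched pixel coordinates in frames A and B.
    R_ab, t_ab     : 3x3 rotation and 3-vector translation from A to B.
    K              : 3x3 camera intrinsics.
    Returns a symmetric epipolar distance per match (lower = more consistent).
    """
    E = skew(t_ab) @ R_ab                               # essential matrix from external pose
    K_inv = np.linalg.inv(K)
    F = K_inv.T @ E @ K_inv                             # fundamental matrix in pixel space

    ones = np.ones((len(kpts_a), 1))
    pa = np.hstack([kpts_a, ones])                      # homogeneous pixels, frame A
    pb = np.hstack([kpts_b, ones])                      # homogeneous pixels, frame B

    lb = pa @ F.T                                       # epipolar lines in frame B
    la = pb @ F                                         # epipolar lines in frame A
    algebraic = np.abs(np.sum(pb * lb, axis=1))         # |pb^T F pa|

    eps = 1e-9
    return algebraic * (1.0 / np.sqrt(lb[:, 0]**2 + lb[:, 1]**2 + eps)
                        + 1.0 / np.sqrt(la[:, 0]**2 + la[:, 1]**2 + eps))


if __name__ == "__main__":
    # Toy example: keep only the features most consistent with a forward motion,
    # i.e. shrink the image search space before running visual odometry.
    rng = np.random.default_rng(0)
    K = np.array([[718.0, 0.0, 607.0], [0.0, 718.0, 185.0], [0.0, 0.0, 1.0]])
    kpts_a = rng.uniform(0.0, 1200.0, size=(200, 2))
    kpts_b = kpts_a + rng.normal(0.0, 2.0, size=(200, 2))     # synthetic matches
    scores = pairwise_consistency(kpts_a, kpts_b, np.eye(3),
                                  np.array([0.0, 0.0, 1.0]), K)
    keep = scores < np.percentile(scores, 50)                  # retain the best half
    print(f"kept {keep.sum()} of {len(keep)} features")
```

The retained subset would then feed a standard matcher or pose solver; in the paper the selection is learned rather than hand-scored, so treat the geometric scoring above purely as an intuition aid.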
