Opti-Acoustic Semantic SLAM with Unknown Objects in Underwater Environments (2403.12837v2)
Abstract: Despite recent advances in semantic Simultaneous Localization and Mapping (SLAM) for terrestrial and aerial applications, underwater semantic SLAM remains an open and largely unaddressed research problem due to the unique sensing modalities and object classes found underwater. This paper presents an object-based semantic SLAM method for underwater environments that can identify, localize, classify, and map a wide variety of marine objects without a priori knowledge of the object classes present in the scene. The method performs unsupervised object segmentation and object-level feature aggregation, and then uses opti-acoustic sensor fusion for object localization. Probabilistic data association determines observation-to-landmark correspondences; given these correspondences, the method jointly optimizes landmark and vehicle position estimates. Indoor and outdoor underwater datasets with diverse objects and challenging acoustic and lighting conditions are collected for evaluation and made publicly available. Quantitative and qualitative results show that the proposed method achieves lower trajectory error than baseline methods and obtains map accuracy comparable to a baseline closed-set method that requires hand-labeled data for all objects in the scene.
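The abstract describes a pipeline of unsupervised segmentation, opti-acoustic fusion for object localization, probabilistic data association, and joint landmark/pose optimization. The sketch below illustrates two of those steps under stated assumptions: an optical bearing (pinhole back-projection of a detection pixel) scaled by an acoustic range to obtain a 3D object position, and chi-square-gated Mahalanobis association against existing landmarks. The camera model, function names, thresholds, and numbers are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch (not the paper's implementation): opti-acoustic object
# localization by fusing a camera bearing with a sonar range, followed by
# chi-square-gated Mahalanobis data association against existing landmarks.
import numpy as np
from scipy.stats import chi2


def backproject_bearing(pixel, K):
    """Unit bearing ray in the camera frame for pixel (u, v)."""
    uv1 = np.array([pixel[0], pixel[1], 1.0])
    ray = np.linalg.inv(K) @ uv1
    return ray / np.linalg.norm(ray)


def opti_acoustic_point(pixel, sonar_range, K):
    """Scale the optical bearing by the acoustic range to get a 3D point."""
    return sonar_range * backproject_bearing(pixel, K)


def associate(obs, obs_cov, landmarks, landmark_covs, confidence=0.95):
    """Return the index of the matched landmark, or None to spawn a new one.

    `obs` is a 3D point in the map frame (vehicle pose already applied).
    The gate is the chi-square quantile for 3 degrees of freedom.
    """
    gate = chi2.ppf(confidence, df=3)
    best_idx, best_d2 = None, np.inf
    for i, (mu, cov) in enumerate(zip(landmarks, landmark_covs)):
        innov = obs - mu
        S = cov + obs_cov                        # innovation covariance
        d2 = innov @ np.linalg.solve(S, innov)   # squared Mahalanobis distance
        if d2 < gate and d2 < best_d2:
            best_idx, best_d2 = i, d2
    return best_idx


# Example: detection at pixel (410, 230) with a 6.2 m sonar return,
# assuming an identity vehicle pose so camera and map frames coincide.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
point = opti_acoustic_point((410.0, 230.0), 6.2, K)
landmarks = [np.array([0.8, -0.1, 6.0]), np.array([4.0, 2.0, 10.0])]
covariances = [np.eye(3) * 0.25, np.eye(3) * 0.25]
match = associate(point, np.eye(3) * 0.1, landmarks, covariances)
print("fused 3D point:", point, "-> associated landmark index:", match)
```

In a full system the matched correspondences would then feed the joint optimization of landmark and vehicle position estimates (e.g., an incremental smoother), which this sketch intentionally omits.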