- The paper introduces a novel VTR algorithm that uses 3D semantic maps to achieve pose-independent navigation for UGVs.
- It integrates ORB-SLAM and YOLOv3 to extract semantic landmarks and construct 3D maps, ensuring robust relocalization during path repetition.
- The system demonstrates reliable performance in dynamic, GPS-denied environments, enabling both forward and reverse path navigation.
Robust Visual Teach and Repeat for UGVs Using 3D Semantic Maps: An Analytical Overview
The paper "Robust Visual Teach and Repeat for UGVs Using 3D Semantic Maps" introduces a novel Visual Teach and Repeat (VTR) algorithm tailored for Unmanned Ground Vehicles (UGVs), which integrates three-dimensional semantic mapping to enhance navigation robustness. This algorithm addresses existing challenges in traditional VTR systems, particularly those related to pose dependency and environmental changes.
The proposed algorithm employs semantic landmarks, extracted from environmental objects, to construct a 3D semantic map that includes both the camera trajectory and semantic object positions during the teach phase. The method leverages the output of ORB-SLAM, a widely used visual SLAM system, to capture camera poses and generate 3D point clouds, alongside YOLOv3, a CNN-based object detection model, to identify and categorize objects within the environment. This strategic integration facilitates pose-independent navigation by enabling robust relocalization in the repeat phase, even when the robot starts from an initial pose significantly different from that during the teach phase.
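The teach-phase bookkeeping described above can be sketched as a minimal data structure. This is an illustrative assumption, not the paper's implementation: the class names, the simplified pose representation, and the `(label, xyz)` detection format are all invented here, standing in for ORB-SLAM keyframe poses and YOLOv3 detections back-projected through the 3D point cloud.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticLandmark:
    label: str        # object class from the detector (e.g. "door")
    position: tuple   # estimated 3D position in the map frame


@dataclass
class SemanticMap:
    # Camera trajectory from the SLAM front end, plus semantic objects.
    trajectory: list = field(default_factory=list)
    landmarks: list = field(default_factory=list)

    def add_keyframe(self, pose, detections):
        """Record a SLAM pose and attach its 3D semantic detections.

        `pose` is a camera pose (reduced to a 3D position for brevity);
        `detections` is a list of (label, xyz) pairs, assumed to come
        from detector boxes back-projected through the SLAM point cloud.
        """
        self.trajectory.append(pose)
        for label, xyz in detections:
            self.landmarks.append(SemanticLandmark(label, xyz))
```

During the teach phase, each keyframe contributes both to the trajectory and to the pool of semantic landmarks that the repeat phase later matches against.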
Key Contributions
- Pose Independence and Robust Relocalization: By constructing a detailed 3D semantic map, the algorithm achieves pose-independent relocalization. This contrasts with typical VTR methods that rely solely on local image features or 2D maps, which often fail under significant pose variation or altered environmental conditions.
- Backward and Forward Path Repetition: A significant highlight of the research is the demonstrated capability of repeating both forward and reverse teach paths. This adaptability is a distinct improvement over existing methods that predominantly focus on forward path repetition.
- Environmental Robustness: The paper claims robust performance amidst changes in the environment, such as object relocations, provided that fewer than half the semantic objects are moved. This robustness is reportedly due to the algorithm's reliance on semantic features rather than precise spatial features.
- Comparative Performance Analysis: The authors conducted extensive evaluations, comparing their approach to contemporary VTR algorithms and image-based relocalization methods, such as SuperGlue matching combined with RANSAC, as well as pure ORB-SLAM relocalization. The empirical results presented showcase the proposed system's enhanced accuracy in navigating complex environments from disparate initial poses.
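The majority-voting intuition behind the environmental-robustness claim can be illustrated with a toy consensus relocalizer. This 2D, translation-only sketch is an assumption for illustration (the function name, tolerance, and `(label, (x, y))` format are invented, and the paper's actual method operates on full 3D poses); it shows why a minority of moved objects cannot outvote the static majority:

```python
import math


def relocalize(map_landmarks, observations, tol=0.5):
    """Estimate the robot's translation by semantic consensus.

    `map_landmarks` and `observations` are lists of (label, (x, y))
    pairs: teach-phase map positions vs. current detections in the
    robot frame. Each same-label pairing proposes a candidate
    translation; the hypothesis consistent with the most observations
    wins, so a minority of moved objects is simply outvoted.
    """
    best_t, best_inliers = None, -1
    for m_label, (mx, my) in map_landmarks:
        for o_label, (ox, oy) in observations:
            if m_label != o_label:
                continue
            t = (mx - ox, my - oy)  # candidate translation hypothesis
            inliers = 0
            for lbl, (x, y) in observations:
                # Does this observation, shifted by t, land on a
                # same-label landmark in the map?
                if any(l == lbl and
                       math.hypot(px - (x + t[0]), py - (y + t[1])) < tol
                       for l, (px, py) in map_landmarks):
                    inliers += 1
            if inliers > best_inliers:
                best_t, best_inliers = t, inliers
    return best_t, best_inliers
```

With fewer than half the semantic objects displaced, the true translation still collects the largest inlier count, mirroring the robustness condition reported in the paper.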
Implications and Future Directions
From a practical standpoint, this research opens up applications for UGVs in GPS-denied scenarios, such as indoor environments or subterranean settings, by reducing dependency on initial pose conditions. The use of semantic maps supports not only navigation but also potential future integration into dynamic scenarios where objects move frequently.
Theoretically, this work invites further exploration into 3D semantic mapping techniques, perhaps integrating more advanced machine learning models for object recognition and localization. Future research could also investigate scaling the system for larger outdoor environments and improving robustness to extreme environmental dynamics. Additionally, considering dynamic obstacles and multiple robotic agents operating in tandem could extend the algorithm's applicability in more expansive and complex settings.
In conclusion, the paper contributes a substantial advancement in VTR methodologies by achieving a balance between environmental adaptability and pose independence, laying a foundation for more robust autonomous robotic navigation systems in challenging settings.