3DS-SLAM: A 3D Object Detection based Semantic SLAM towards Dynamic Indoor Environments (2310.06385v1)
Abstract: Dynamic elements in the environment can degrade camera localization accuracy, because they violate the static-environment assumption that underlies Simultaneous Localization and Mapping (SLAM) algorithms. Recent semantic SLAM systems for dynamic environments rely solely on 2D semantic information, rely solely on geometric information, or combine the two in a loosely integrated manner. In this paper, we introduce 3DS-SLAM (3D Semantic SLAM), which is tailored for dynamic scenes and uses visual 3D object detection. 3DS-SLAM is a tightly coupled algorithm that resolves semantic and geometric constraints sequentially. We design a 3D part-aware hybrid transformer for point cloud-based object detection to identify dynamic objects. We then propose a dynamic feature filter based on HDBSCAN clustering to extract objects with significant absolute depth differences. Compared against ORB-SLAM2, 3DS-SLAM exhibits an average improvement of 98.01% across the dynamic sequences of the TUM RGB-D dataset. It also outperforms four other leading SLAM systems designed for dynamic environments.
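The abstract does not spell out how the "average improvement" figure is computed. A minimal sketch, assuming the common convention of relative error reduction against a baseline (here, baseline error minus proposed error, divided by baseline error), might look like the following. The function name, the sequence names, and all numeric values are hypothetical and are not results from the paper.

```python
from statistics import mean


def relative_improvement(baseline_rmse: float, proposed_rmse: float) -> float:
    """Percentage reduction in error relative to the baseline.

    Assumed convention: (baseline - proposed) / baseline * 100.
    """
    if baseline_rmse <= 0:
        raise ValueError("baseline_rmse must be positive")
    return (baseline_rmse - proposed_rmse) / baseline_rmse * 100.0


# Hypothetical trajectory-error RMSE values in metres, given as
# (baseline, proposed) pairs. These are illustrative placeholders only.
hypothetical_rmse = {
    "sequence_a": (0.50, 0.01),
    "sequence_b": (0.80, 0.02),
}

# Improvement per sequence, then the unweighted mean across sequences.
per_sequence = {
    name: relative_improvement(baseline, proposed)
    for name, (baseline, proposed) in hypothetical_rmse.items()
}
average_improvement = mean(per_sequence.values())

for name, value in per_sequence.items():
    print(f"{name}: {value:.2f}% improvement")
print(f"average: {average_improvement:.2f}%")
```

Averaging per-sequence percentages, as done here, weights every sequence equally regardless of its absolute error; whether the paper averages this way is an assumption of this sketch.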
Authors: Ghanta Sai Krishna, Kundrapu Supriya, Sabur Baidya