Structure Aware SLAM using Quadrics and Planes
The paper "Structure Aware SLAM using Quadrics and Planes" presents an approach to Simultaneous Localization and Mapping (SLAM) that leverages higher-level geometric representations, namely planes and quadrics. The work is motivated by the limits of traditional point-based SLAM methods, which it extends by integrating semantic information obtained through object detection.
Technical Overview
This research addresses a fundamental limitation of conventional SLAM: point-based representations, although accurate for camera localization, carry no semantic information for high-level tasks such as object manipulation. The authors propose a SLAM framework that combines object detection with compact geometric representations, using dual quadrics for objects and infinite planes for dominant planar structures. The resulting points-planes-quadrics representation makes it possible to integrate Manhattan-world and object affordance constraints, improving localization and producing semantically meaningful maps.
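As an illustration, a Manhattan-world constraint can be expressed as a soft penalty that pushes pairs of plane normals toward being either parallel or orthogonal. The sketch below is a minimal, hypothetical formulation of such a residual; the function name and exact cost are assumptions for illustration, not the paper's formulation:

```python
import numpy as np

def manhattan_residual(n1, n2):
    """Residual that is zero when two plane normals satisfy a
    Manhattan-world relation (parallel or orthogonal).
    Hypothetical formulation; the paper's exact cost may differ."""
    n1 = n1 / np.linalg.norm(n1)
    n2 = n2 / np.linalg.norm(n2)
    c = abs(float(n1 @ n2))  # |cos angle| between the two normals
    # Distance to the nearest Manhattan configuration:
    # 0 when parallel (c == 1) or orthogonal (c == 0).
    return min(c, 1.0 - c)
```

In an optimization back end, a term like this would be added as a soft factor between pairs of plane landmarks, penalizing orientations that violate the assumed rectilinear structure of indoor scenes.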
The paper's central technical contribution is a decomposition of dual quadrics that fits naturally into the nonlinear least-squares optimization typical of SLAM back ends. The decomposition preserves computational efficiency and the sparsity pattern crucial for real-time operation. In addition, affordance relationships are introduced as supporting (tangency) constraints between quadric objects and planes, further refining localization accuracy. The representation also accommodates the Manhattan-world assumption prevalent in structured indoor environments, imposing additional regularity on both mapping and localization.
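To make the representation concrete: in standard projective geometry, a dual quadric (an ellipsoid here) can be built from a pose and three semi-axis lengths, and a plane π is tangent to it exactly when π^T Q π = 0, which is the algebraic form of a supporting constraint. The following is a minimal NumPy sketch under those standard conventions; the function names are illustrative, not the paper's API:

```python
import numpy as np

def dual_quadric(R, t, semi_axes):
    """4x4 dual quadric of an ellipsoid with rotation R (3x3),
    centre t (3,), and semi-axis lengths (a, b, c)."""
    a, b, c = semi_axes
    Q0 = np.diag([a**2, b**2, c**2, -1.0])  # axis-aligned ellipsoid at the origin
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    # Dual quadrics transform covariantly under the point transform T.
    return T @ Q0 @ T.T

def tangency_residual(pi, Q):
    """Zero iff the plane pi = (n, d) is tangent to the dual quadric Q."""
    pi = pi / np.linalg.norm(pi[:3])  # normalise so the residual is scale-free
    return float(pi @ Q @ pi)
```

For a unit sphere at the origin, the plane z = 1, written as π = (0, 0, 1, −1), gives a zero residual. In a framework like the paper's, such residuals would enter the least-squares problem as soft constraint factors alongside the usual reprojection terms.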
Results and Implications
Empirical evaluations on publicly available datasets such as TUM RGB-D and NYU Depth V2 demonstrate the efficacy of the proposed method. Notably, scenes with low texture but rich planar structure benefit from the plane representation, significantly improving trajectory accuracy over point-based SLAM. The inclusion of quadrics yields a compact yet semantically informative map with modest computational overhead.
Numerical results show notable improvements in absolute trajectory error across the test scenarios. For instance, adding planes and imposing Manhattan constraints substantially reduced trajectory error, by up to 72.77% in some cases. Integrating all of the proposed elements (points, planes, quadrics, and the Manhattan and tangency constraints) further improves map consistency and localization accuracy.
Future Directions
This research paves the way for future work on SLAM systems that handle a wider variety of object types and constraints. Incorporating real-time object detection could address current runtime limitations. Another promising direction is a purely monocular implementation that hypothesizes plane structures using deep-learning-based depth and surface-normal estimation. Enriching inter-object relationships and refining the conditions under which the constraints are applied may further improve performance.
In summary, this paper contributes significantly to advancing SLAM by incorporating semantic object representations. It bridges the gap between localization precision and semantic map enrichment, laying a foundation for further exploration and development in visual SLAM.