- The paper introduces HDMapNet, an online framework that fuses camera and LiDAR data to construct local semantic maps, outperforming prior methods by more than 50% on all metrics.
- It combines geometric projection with neural feature extraction to transform perspective-view camera images into bird's-eye-view representations.
- By reducing reliance on manual map annotation, the framework offers a scalable, cost-effective path toward real-time autonomous driving applications.
HDMapNet: An Online HD Map Construction and Evaluation Framework
The paper presents HDMapNet, a comprehensive framework for constructing high-definition (HD) semantic maps tailored to autonomous driving. Unlike traditional mapping pipelines, which require extensive manual annotation and significant human effort to keep map semantics up to date, HDMapNet uses onboard sensor data to estimate local semantic maps on the fly. This directly addresses the scalability and cost limitations of offline, globally maintained HD maps.
HDMapNet takes a multi-modal sensor approach, integrating inputs from both cameras and LiDAR to produce vectorized map elements in a bird's-eye-view representation. The framework includes a view transformation module that converts features from perspective-view images into the bird's-eye view even when depth information is unavailable, blending geometric projection with neural feature extraction to combine visual data with 3D environmental structure.
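The geometric side of such a view transformation can be illustrated with a ground-plane projection: sample a grid of bird's-eye-view locations, assume they lie on the road plane (z = 0), and project them through the camera model to find where to read image features. The sketch below is a minimal illustration under that flat-ground assumption, not HDMapNet's actual learned view transformer; the function name and the camera setup are hypothetical.

```python
import numpy as np

def ipm_sample_coords(K, R, t, bev_xy):
    """Project ground-plane BEV points (z = 0 in the ego frame) into image pixels.

    K: 3x3 camera intrinsics; R, t: ego-to-camera rotation and translation.
    bev_xy: (N, 2) array of ground-plane coordinates in meters.
    Returns (N, 2) pixel coordinates; points behind the camera become NaN.
    """
    n = bev_xy.shape[0]
    pts = np.hstack([bev_xy, np.zeros((n, 1))])   # lift BEV points to z = 0
    cam = pts @ R.T + t                           # ego frame -> camera frame
    uvw = cam @ K.T                               # apply pinhole intrinsics
    uv = np.full((n, 2), np.nan)
    valid = uvw[:, 2] > 1e-6                      # keep points in front of the camera
    uv[valid] = uvw[valid, :2] / uvw[valid, 2:3]  # perspective divide
    return uv
```

In a full pipeline, these pixel coordinates would be used to bilinearly sample image (or feature-map) values onto the BEV grid; HDMapNet additionally learns a neural mapping precisely because the flat-ground assumption breaks for anything not on the road surface.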
Evaluations on the publicly available nuScenes dataset show HDMapNet outperforming alternative methods, with an improvement of over 50% in all metrics when both camera and LiDAR data are fused, underscoring the benefit of multi-modal input for recognizing complex map elements. The paper also introduces evaluation metrics at both the semantic and instance levels, giving a finer-grained view of where the method succeeds.
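At the semantic level, map-construction quality is commonly scored with per-class intersection-over-union (IoU) between the predicted and ground-truth BEV grids. The snippet below is a generic per-class IoU sketch for rasterized BEV maps, not the paper's exact metric implementation; the class layout (0 = background) is an assumption for illustration.

```python
import numpy as np

def bev_iou(pred, gt, num_classes):
    """Per-class IoU between predicted and ground-truth BEV semantic grids.

    pred, gt: integer class maps of the same shape (0 assumed to be background).
    Returns a list of IoU values for classes 1..num_classes-1 (NaN if a class
    is absent from both maps).
    """
    ious = []
    for c in range(1, num_classes):               # skip the background class
        p, g = pred == c, gt == c
        inter = np.logical_and(p, g).sum()        # pixels both maps label as c
        union = np.logical_or(p, g).sum()         # pixels either map labels as c
        ious.append(inter / union if union else float("nan"))
    return ious
```

Instance-level metrics additionally require matching predicted map elements (e.g., individual lane dividers) to ground-truth instances before scoring, which this semantic-only sketch does not cover.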
The framework shows promise for generating the locally consistent maps that real-time motion planning in autonomous vehicles requires. By reducing reliance on pre-built global HD maps, HDMapNet offers a scalable and cost-effective mapping solution, potentially broadening where autonomous navigation systems are feasible.
This research has implications for both practical implementation and theoretical exploration within the field of AI-driven autonomous systems. Practical applications include more responsive and adaptive autonomous vehicles that can seamlessly integrate into dynamic environments. Theoretically, this work provides a foundation for further research into AI models' capabilities in processing and understanding complex spatial data in real-time settings.
Future work could refine the sensor fusion strategy, since fusing LiDAR and camera inputs already shows marked advantages over single-modality baselines. Incorporating temporal information may further improve map consistency and accuracy, enabling more advanced autonomous driving capabilities. Through its approach to online HD map construction, HDMapNet represents a step toward more scalable and adaptable intelligent transportation systems.