LaneSegNet: Map Learning with Lane Segment Perception for Autonomous Driving (2312.16108v2)

Published 26 Dec 2023 in cs.CV

Abstract: A map, as crucial information for downstream applications of an autonomous driving system, is usually represented in lanelines or centerlines. However, existing literature on map learning primarily focuses on either detecting geometry-based lanelines or perceiving topology relationships of centerlines. Both of these methods ignore the intrinsic relationship between lanelines and centerlines: lanelines bind centerlines. Since simply predicting both types of lane in one model leads to mutually exclusive learning objectives, we advocate the lane segment as a new representation that seamlessly incorporates both geometry and topology information. Thus, we introduce LaneSegNet, the first end-to-end mapping network generating lane segments to obtain a complete representation of the road structure. Our algorithm features two key modifications. One is a lane attention module to capture pivotal region details within the long-range feature space. The other is an identical initialization strategy for reference points, which enhances the learning of positional priors for lane attention. On the OpenLane-V2 dataset, LaneSegNet outperforms previous counterparts by a substantial margin across three tasks, i.e., map element detection (+4.8 mAP), centerline perception (+6.9 DET_l), and the newly defined lane segment perception (+5.6 mAP). Furthermore, it obtains a real-time inference speed of 14.7 FPS. Code is accessible at https://github.com/OpenDriveLab/LaneSegNet.

References (34)
  1. Structured bird’s-eye-view traffic scene understanding from onboard images. In ICCV, 2021.
  2. Topology preserving local road network estimation from single onboard camera image. In CVPR, 2022.
  3. PersFormer: 3D lane detection via perspective transformer and the OpenLane benchmark. In ECCV, 2022.
  4. Per-pixel classification is not all you need for semantic segmentation. In NeurIPS, 2021.
  5. Masked-attention mask transformer for universal image segmentation. In CVPR, 2022.
  6. PivotNet: Vectorized pivot learning for end-to-end HD map construction. In ICCV, 2023.
  7. Sparse dense fusion for 3D object detection. In IROS, 2023.
  8. Deep residual learning for image recognition. In CVPR, 2016.
  9. Planning-oriented autonomous driving. In CVPR, 2023.
  10. Anchor3DLane: Learning to regress 3D anchors for monocular 3D lane detection. In CVPR, 2023.
  11. Open-sourced data ecosystem in autonomous driving: the present and future. arXiv preprint arXiv:2312.03408, 2023.
  12. Delving into the devils of bird’s-eye-view perception: A review, evaluation and recipe. TPAMI, 2023.
  13. HDMapNet: An online HD map construction and evaluation framework. In ICRA, 2022.
  14. Graph-based topology reasoning for driving scenes. arXiv preprint arXiv:2304.05277, 2023.
  15. BEVFormer: Learning bird’s-eye-view representation from multi-camera images via spatiotemporal transformers. In ECCV, 2022.
  16. Learning lane graph representations for motion forecasting. In ECCV, 2020.
  17. Lane graph as path: Continuity-preserving path-wise modeling for online lane graph construction. arXiv preprint arXiv:2303.08815, 2023.
  18. MapTR: Structured modeling and learning for online vectorized HD map construction. In ICLR, 2023.
  19. MapTRv2: An end-to-end framework for online vectorized HD map construction. arXiv preprint arXiv:2308.05736, 2023.
  20. Feature pyramid networks for object detection. In CVPR, 2017.
  21. Focal loss for dense object detection. In ICCV, 2017.
  22. VectorMapNet: End-to-end vectorized HD map learning. In ICML, 2023.
  23. Decoupled weight decay regularization. In ICLR, 2019.
  24. LATR: 3D lane detection from monocular images with transformer. In ICCV, 2023.
  25. V-Net: Fully convolutional neural networks for volumetric medical image segmentation. In 3DV, 2016.
  26. Lift, splat, shoot: Encoding images from arbitrary camera rigs by implicitly unprojecting to 3D. In ECCV, 2020.
  27. DriveLM: Driving with graph visual question answering. arXiv preprint arXiv:2312.14150, 2023.
  28. Tesla AI Day. https://www.youtube.com/watch?v=ODSJsviD_SU, 2022.
  29. OpenLane-V2: A topology reasoning benchmark for scene understanding in autonomous driving. In NeurIPS Datasets and Benchmarks Track, 2023.
  30. Argoverse 2: Next generation datasets for self-driving perception and forecasting. In NeurIPS, 2021.
  31. CenterLineDet: Road lane centerline graph detection with vehicle-mounted sensors by transformer for high-definition map creation. In ICRA, 2023.
  32. StreamMapNet: Streaming mapping network for vectorized online HD map construction. In WACV, 2024.
  33. Cross-view transformers for real-time map-view semantic segmentation. In CVPR, 2022.
  34. Deformable DETR: Deformable transformers for end-to-end object detection. In ICLR, 2021.

Summary

  • The paper introduces a unified lane segment representation that integrates geometric and topological data for comprehensive map learning in autonomous driving.
  • It employs a novel lane attention module with a heads-to-regions mechanism, enhancing both local and long-range feature extraction.
  • An identical initialization strategy for reference points stabilizes the learning of positional priors, and the full model achieves significant mAP gains at 14.7 FPS on the OpenLane-V2 benchmark.

Overview of "LaneSegNet: Map Learning with Lane Segment Perception for Autonomous Driving"

The paper introduces LaneSegNet, an end-to-end approach to map learning for autonomous driving. The work tackles the challenge of obtaining a comprehensive understanding of road structure by integrating geometric and topological information into a unified representation called the lane segment. The authors propose this as an alternative to prior map learning methods, which focus on lanelines or centerlines separately and thereby fail to capture the intrinsic connection between the two.

Key Contributions

LaneSegNet contributes several technical innovations:

  1. Unified Representation: The proposed lane segment representation encompasses both the geometric boundaries and topological connections required to construct a lane graph. This holistic view captures detailed road information, including lane types and directions, that is critical for trajectory planning and decision-making in autonomous driving systems (a minimal data-structure sketch follows this list).
  2. Lane Attention Module: This module is central to the LaneSegNet decoder. Its heads-to-regions mechanism lets the model gather long-range contextual information while accurately discerning local features within the large feature space, improving its interpretation of complex road geometry.
  3. Identical Initialization Strategy: By initializing reference points identically, the network simplifies the learning process for positional priors, thereby improving the overall stability and accuracy of training.
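
To make the representation concrete, here is a minimal sketch of a lane segment as a data structure, under assumptions: each segment pairs a centerline with its bounding left/right lanelines, carries line-type attributes, and stores successor links from which the lane graph can be built. The field names and the fixed 10-points-per-polyline convention are illustrative, not the paper's actual interface.

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class LaneSegment:
    """One lane segment: geometry (boundaries plus centerline) and topology.

    Field names and the 10-points-per-polyline convention are assumptions
    for illustration; the released code may organize this differently.
    """
    centerline: np.ndarray          # (10, 3) ordered 3D points along travel direction
    left_boundary: np.ndarray       # (10, 3) left laneline bound to this centerline
    right_boundary: np.ndarray      # (10, 3) right laneline bound to this centerline
    left_line_type: str = "solid"   # e.g. "solid" | "dashed" | "none"
    right_line_type: str = "dashed"
    successors: list = field(default_factory=list)  # ids of topologically connected segments

def lane_graph_adjacency(segments: dict) -> np.ndarray:
    """Build the directed adjacency matrix of the lane graph from successor lists."""
    ids = sorted(segments)
    index = {sid: i for i, sid in enumerate(ids)}
    adj = np.zeros((len(ids), len(ids)), dtype=bool)
    for sid, seg in segments.items():
        for nxt in seg.successors:
            adj[index[sid], index[nxt]] = True
    return adj
```

Because each lane segment already binds its centerline to the surrounding lanelines, deriving either a laneline map or a centerline graph from a set of predicted segments is a projection of this one structure rather than the output of a second model.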

LaneSegNet is evaluated using the OpenLane-V2 benchmark, where it demonstrates substantial improvements across multiple tasks: map element detection, centerline perception, and the newly introduced lane segment perception. The model achieves notable gains in mean Average Precision (mAP) and real-time inference speed (14.7 FPS), underscoring its potential for real-time applications.

Methodology and Evaluation

The paper outlines a three-component methodology for lane segment perception: an encoder for BEV feature extraction, a lane segment decoder built around the lane attention module, and a lane segment predictor that outputs each segment's geometry, category, and topology. Together, these reconstruct a comprehensive map of the road environment; a simplified sketch of the decoder's two key ideas follows.
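
The sketch below is a minimal PyTorch-style rendering of those two ideas, written under assumptions: each attention head is tied to its own reference point along the lane (one reading of the heads-to-regions mechanism), sampling is done with `grid_sample` on a single BEV feature map, and all queries start from one shared learnable reference point (identical initialization). Shapes, names, and the offset scale are illustrative guesses, not the released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HeadsToRegionsAttention(nn.Module):
    """Sketch of heads-to-regions attention: every head samples the BEV map
    around its own per-lane reference point, so the heads jointly cover the
    elongated lane region. Illustrative only, not the paper's implementation."""

    def __init__(self, embed_dim: int = 256, num_heads: int = 8, num_points: int = 4):
        super().__init__()
        assert embed_dim % num_heads == 0
        self.num_heads, self.num_points = num_heads, num_points
        self.head_dim = embed_dim // num_heads
        self.offsets = nn.Linear(embed_dim, num_heads * num_points * 2)  # per-head sampling offsets
        self.weights = nn.Linear(embed_dim, num_heads * num_points)      # per-sample attention weights
        self.proj = nn.Linear(embed_dim, embed_dim)

    def forward(self, queries, ref_points, bev):
        # queries: (B, Q, C); ref_points: (B, Q, H, 2) in [-1, 1] BEV coords,
        # one reference point per head; bev: (B, C, Hb, Wb) BEV features.
        B, Q, _ = queries.shape
        off = self.offsets(queries).view(B, Q, self.num_heads, self.num_points, 2)
        w = self.weights(queries).view(B, Q, self.num_heads, self.num_points).softmax(-1)
        # Sample around each head's own reference point (0.1 is an assumed offset scale).
        loc = (ref_points.unsqueeze(3) + 0.1 * off.tanh()).clamp(-1, 1)
        bev_heads = bev.view(B, self.num_heads, self.head_dim, *bev.shape[-2:])
        outputs = []
        for h in range(self.num_heads):  # each head attends to its own region
            feat = F.grid_sample(bev_heads[:, h], loc[:, :, h], align_corners=False)  # (B, hd, Q, P)
            outputs.append((feat * w[:, :, h].unsqueeze(1)).sum(-1))                  # (B, hd, Q)
        return self.proj(torch.cat(outputs, dim=1).transpose(1, 2))                   # (B, Q, C)

# Identical initialization: all queries begin from the same learnable reference
# point, so positional priors are learned rather than hand-scattered.
num_queries, num_heads = 200, 8
shared_ref = nn.Parameter(torch.zeros(1, 1, num_heads, 2))
ref_points = (torch.sigmoid(shared_ref) * 2 - 1).expand(2, num_queries, -1, -1)  # (B, Q, H, 2)

attn = HeadsToRegionsAttention()
out = attn(torch.randn(2, num_queries, 256), ref_points, torch.randn(2, 256, 100, 200))
```

Distributing the heads over regions means an elongated lane can be covered without any single sampling location having to see the whole segment, while the shared starting reference point removes the need to hand-design scattered anchors.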

LaneSegNet is evaluated on map element detection, centerline perception, and a suite of metrics designed for the new lane segment perception task. Across these metrics it outperforms prior methods such as MapTR and TopoNet, with gains of +4.8 mAP on map element detection, +6.9 DET_l on centerline perception, and +5.6 mAP on lane segment perception.

Implications and Future Work

The implications of this research are significant for the field of autonomous vehicles. By enhancing the perception of road elements through a unified framework, the model can potentially lead to more robust and efficient autonomous navigation systems. The improved accuracy in map learning could directly translate to better navigation, planning, and safety in real-world settings.

Moving forward, the authors suggest exploring more sophisticated backbones and expanding the approach's applicability to other datasets, such as nuScenes and Waymo, which were not considered in this paper. Additionally, the potential of LaneSegNet in benefiting downstream applications, such as trajectory prediction and path planning, is a promising future direction for research.

This paper marks an important step towards enhancing map learning in autonomous driving by introducing a new paradigm focused on holistic lane segment perception and integrating advanced machine learning techniques into the autonomous driving stack.
