Online Map Vectorization for Autonomous Driving: A Rasterization Perspective (2306.10502v2)

Published 18 Jun 2023 in cs.CV

Abstract: Vectorized high-definition (HD) maps are essential for autonomous driving, providing detailed and precise environmental information for advanced perception and planning. However, current map vectorization methods often exhibit deviations, and the existing evaluation metric for map vectorization is not sensitive enough to detect them. To address these limitations, we propose integrating the philosophy of rasterization into map vectorization. Specifically, we introduce a new rasterization-based evaluation metric, which is more sensitive to such deviations and better suited to real-world autonomous driving scenarios. Furthermore, we propose MapVR (Map Vectorization via Rasterization), a novel framework that applies differentiable rasterization to vectorized outputs and then performs precise, geometry-aware supervision on the rasterized HD maps. Notably, MapVR designs tailored rasterization strategies for different geometric shapes, enabling effective adaptation to a wide range of map elements. Experiments show that incorporating rasterization into map vectorization greatly enhances performance at no extra computational cost during inference, leading to more accurate map perception and ultimately promoting safer autonomous driving.
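To make the idea of comparing vectorized outputs in raster space concrete, the following is a minimal NumPy sketch, not the MapVR implementation. It soft-rasterizes a polyline by passing each pixel's distance to the nearest segment through a sigmoid, then scores two rasters with a soft IoU. The function names, the `line_width` and `tau` parameters, and the sigmoid falloff are all illustrative assumptions; the paper's actual rasterizer and metric may differ.

```python
import numpy as np


def point_segment_distance(points, a, b):
    """Euclidean distance from each point in `points` (N, 2) to segment a-b."""
    ab = b - a
    denom = float(ab @ ab)
    if denom == 0.0:  # degenerate segment: fall back to point distance
        return np.linalg.norm(points - a, axis=1)
    # Projection parameter, clamped so the closest point stays on the segment.
    t = np.clip(((points - a) @ ab) / denom, 0.0, 1.0)
    closest = a + t[:, None] * ab
    return np.linalg.norm(points - closest, axis=1)


def soft_rasterize_polyline(vertices, height, width, line_width=1.0, tau=0.25):
    """Render a polyline as a soft mask in [0, 1] (hypothetical helper).

    `line_width` is the half-width in pixels at which the mask equals 0.5;
    `tau` controls how sharply the mask falls off. Both are assumptions.
    """
    vertices = np.asarray(vertices, dtype=float)
    ys, xs = np.mgrid[0:height, 0:width]
    centers = np.stack([xs.ravel() + 0.5, ys.ravel() + 0.5], axis=1)
    dist = np.full(centers.shape[0], np.inf)
    for a, b in zip(vertices[:-1], vertices[1:]):
        dist = np.minimum(dist, point_segment_distance(centers, a, b))
    # Smooth step: pixels close to the line approach 1, distant pixels approach 0.
    z = np.clip((dist - line_width) / tau, -50.0, 50.0)
    return (1.0 / (1.0 + np.exp(z))).reshape(height, width)


def soft_iou(pred, target, eps=1e-6):
    """Soft intersection-over-union between two masks with values in [0, 1]."""
    inter = np.sum(pred * target)
    union = np.sum(pred + target - pred * target)
    return float(inter / (union + eps))


# Usage: a ground-truth lane and a prediction shifted by 4 pixels.
gt = soft_rasterize_polyline([(2, 8), (14, 8)], 16, 16)
shifted = soft_rasterize_polyline([(2, 12), (14, 12)], 16, 16)
print("IoU with itself:", soft_iou(gt, gt))
print("IoU when shifted:", soft_iou(gt, shifted))
```

Because every operation here is smooth in the vertex coordinates, the same construction could be ported to an autodiff framework and used as a training loss; in this sketch it only serves to show that a small positional deviation lowers the raster-space score.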
