
Leveraging Enhanced Queries of Point Sets for Vectorized Map Construction (2402.17430v2)

Published 27 Feb 2024 in cs.CV

Abstract: In autonomous driving, the high-definition (HD) map plays a crucial role in localization and planning. Recently, several methods have facilitated end-to-end online map construction in DETR-like frameworks. However, little attention has been paid to the potential of the query mechanism for map elements. This paper introduces MapQR, an end-to-end method that emphasizes enhancing query capabilities for constructing online vectorized maps. To probe desirable information efficiently, MapQR uses a novel query design, called the scatter-and-gather query, which explicitly models content and position as separate parts. The base map instance queries are scattered to different reference points and have positional embeddings added to them to probe information from BEV features. These scattered queries are then gathered back to enhance the information within each map instance. Together with a simple and effective improvement to the BEV encoder, MapQR achieves the best mean average precision (mAP) and maintains good efficiency on both nuScenes and Argoverse 2. In addition, integrating the query design into other models can boost their performance significantly. The source code is available at https://github.com/HXMap/MapQR.
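
The scatter-and-gather idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the sinusoidal form of the positional embedding, the nearest-cell BEV lookup (a stand-in for the decoder's attention over BEV features), and the linear projection used for gathering are all assumptions made for illustration.

```python
import numpy as np

def sinusoidal_embedding(points, dim):
    """Embed 2D points in [0, 1] into `dim` channels (dim divisible by 4).

    Assumed form of the positional embedding; hypothetical helper.
    """
    freqs = 10000.0 ** (-np.arange(dim // 4) / (dim // 4))
    angles = 2 * np.pi * points[..., None] * freqs            # (..., 2, dim/4)
    emb = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return emb.reshape(*points.shape[:-1], dim)               # (..., dim)

def scatter_gather(instance_queries, ref_points, bev, w_gather):
    """One illustrative scatter-and-gather step (hypothetical helper).

    instance_queries: (n, c) content query per map instance
    ref_points:       (n, p, 2) normalized (x, y) reference points
    bev:              (h, w, c) bird's-eye-view feature map
    w_gather:         (p * c, c) assumed projection for the gather step
    """
    n, c = instance_queries.shape
    p = ref_points.shape[1]
    # Scatter: share each instance's content query across its p points.
    scattered = np.broadcast_to(instance_queries[:, None, :], (n, p, c))
    # Position part: add an embedding of each reference point.
    probes = scattered + sinusoidal_embedding(ref_points, c)
    # Probe BEV features. Assumption: nearest-cell lookup stands in
    # for the attention a real decoder layer would use.
    h, w, _ = bev.shape
    rows = np.clip((ref_points[..., 1] * h).astype(int), 0, h - 1)
    cols = np.clip((ref_points[..., 0] * w).astype(int), 0, w - 1)
    probed = probes + bev[rows, cols]                          # (n, p, c)
    # Gather: concatenate the per-point queries and project back to
    # one query per instance (assumed aggregation).
    return probed.reshape(n, p * c) @ w_gather                 # (n, c)

rng = np.random.default_rng(0)
n, p, c = 3, 4, 8                        # instances, points, channels
queries = rng.normal(size=(n, c))
ref_points = rng.uniform(size=(n, p, 2))
bev = rng.normal(size=(20, 10, c))
w_gather = rng.normal(size=(p * c, c)) / np.sqrt(p * c)
out = scatter_gather(queries, ref_points, bev, w_gather)
print(out.shape)  # (3, 8)
```

Keeping one content query per instance and adding position only after scattering mirrors the explicit content/position separation the abstract describes; the specific probing and aggregation choices here are placeholders.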
