MachMap: End-to-End Vectorized Solution for Compact HD-Map Construction (2306.10301v1)

Published 17 Jun 2023 in cs.CV

Abstract: This report introduces the 1st-place winning solution for the Autonomous Driving Challenge 2023 - Online HD-map Construction. By delving into the vectorization pipeline, we develop an effective architecture, termed MachMap, which formulates HD-map construction as point detection in bird's-eye-view space in an end-to-end manner. First, we introduce a novel map-compaction scheme into our framework, which reduces the number of vectorized points by 93% without any degradation in expression performance. Building upon this process, we then follow the general query-based paradigm and propose a strong baseline that integrates a powerful CNN-based backbone such as InternImage, a temporal-based instance decoder, and a well-designed point-mask coupling head. Additionally, an optional ensemble stage refines the model predictions for better performance. Our MachMap-tiny with IN-1K initialization achieves a mAP of 79.1 on the Argoverse2 benchmark, and the further improved MachMap-huge reaches the best mAP of 83.5, outperforming all other online HD-map construction approaches on the final leaderboard by a distinct margin (at least 9.8 mAP).
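
The abstract does not spell out how the map-compaction scheme works. As a rough illustration of how a densely sampled map polyline can be reduced to a handful of points, here is a minimal sketch of Douglas-Peucker line simplification. This is a generic, well-known technique used here as an assumption, not necessarily the method MachMap uses. The function names, the `epsilon` tolerance, and the toy lane geometry are all hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b (2D tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, epsilon):
    """Simplify a polyline, keeping only points that deviate more than
    epsilon from the chord joining the endpoints of each sub-span."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, max_idx = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            max_dist, max_idx = d, i
    if max_dist <= epsilon:
        # Every interior point is close to the chord: keep the endpoints only.
        return [first, last]
    # Otherwise keep the farthest point and recurse on both halves.
    left = douglas_peucker(points[: max_idx + 1], epsilon)
    right = douglas_peucker(points[max_idx:], epsilon)
    return left[:-1] + right


# Hypothetical toy lane boundary: an L-shaped polyline sampled every metre.
lane = [(float(x), 0.0) for x in range(51)] + [(50.0, float(y)) for y in range(1, 51)]
compact = douglas_peucker(lane, epsilon=0.05)
print(len(lane), "->", len(compact))  # 101 -> 3
```

On this toy shape the straight runs collapse to their endpoints and only the corner survives. Real map elements with curvature retain more points, and the reduction achieved depends on the chosen tolerance.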
