SemVecNet: Generalizable Vector Map Generation for Arbitrary Sensor Configurations (2405.00250v1)
Abstract: Vector maps are essential in autonomous driving for tasks like localization and planning, yet their creation and maintenance are notably costly. While recent advances in online vector map generation for autonomous vehicles are promising, current models lack adaptability to different sensor configurations. They tend to overfit to specific sensor poses, leading to decreased performance and higher retraining costs. This limitation hampers their practical use in real-world applications. In response to this challenge, we propose a modular pipeline for vector map generation with improved generalization to sensor configurations. The pipeline leverages probabilistic semantic mapping to generate a bird's-eye-view (BEV) semantic map as an intermediate representation. This intermediate representation is then converted to a vector map using the MapTRv2 decoder. By adopting a BEV semantic map that is robust to different sensor configurations, our proposed approach significantly improves generalization performance. We evaluate the model on datasets with sensor configurations not used during training. Our evaluation sets include larger public datasets and smaller-scale private data collected on our platform. Our model generalizes significantly better than state-of-the-art methods.
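To make the intermediate representation concrete, the following is a minimal sketch of how a probabilistic BEV semantic map could be accumulated from semantically labeled points. It is not the authors' implementation: the function name `update_bev_semantic_map`, the confusion-matrix observation model, the grid size, and the cell size are all assumptions chosen for illustration. The sketch assumes the points have already been transformed out of the sensor frame into a common grid frame, which is the step that would make such a map independent of sensor pose. The MapTRv2 decoder that consumes the map is not shown.

```python
import numpy as np

def update_bev_semantic_map(log_belief, points_xy, labels, confusion, cell_size=0.5):
    """Fuse labeled points into a BEV grid with a per-cell Bayesian update.

    log_belief: (H, W, C) log-probabilities over C classes for each cell.
    points_xy:  (N, 2) coordinates in metres in the grid frame (origin at
                the grid corner), already transformed out of the sensor frame.
    labels:     (N,) integer class observed for each point.
    confusion:  (C, C) assumed observation model,
                confusion[observed, true] = P(observed | true).
    """
    H, W, _ = log_belief.shape
    cols = np.floor(points_xy[:, 0] / cell_size).astype(int)
    rows = np.floor(points_xy[:, 1] / cell_size).astype(int)

    # Discard points that fall outside the grid.
    inside = (rows >= 0) & (rows < H) & (cols >= 0) & (cols < W)
    rows, cols, labels = rows[inside], cols[inside], labels[inside]

    # Log-likelihood of each observation under every candidate true class.
    log_lik = np.log(confusion[labels])          # shape (M, C)

    updated = log_belief.copy()
    # np.add.at accumulates correctly when several points hit the same cell.
    np.add.at(updated, (rows, cols), log_lik)

    # Renormalize so each cell holds a valid log-probability vector.
    updated -= np.logaddexp.reduce(updated, axis=-1, keepdims=True)
    return updated

# Toy usage: 3 classes, a 4x4 grid of 0.5 m cells, uniform prior.
num_classes = 3
belief = np.full((4, 4, num_classes), np.log(1.0 / num_classes))
confusion = np.full((num_classes, num_classes), 0.1) + 0.7 * np.eye(num_classes)

points = np.array([[0.2, 0.2], [0.3, 0.1], [1.6, 1.6], [9.0, 9.0]])
point_labels = np.array([1, 1, 2, 0])  # the last point lies outside the grid

belief = update_bev_semantic_map(belief, points, point_labels, confusion)
semantic_map = belief.argmax(axis=-1)  # most likely class per cell
```

Working in log space keeps repeated updates numerically stable, and cells that receive no observations keep their prior belief.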