VAD: Vectorized Scene Representation for Efficient Autonomous Driving (2303.12077v3)

Published 21 Mar 2023 in cs.RO and cs.CV

Abstract: Autonomous driving requires a comprehensive understanding of the surrounding environment for reliable trajectory planning. Previous works rely on dense rasterized scene representation (e.g., agent occupancy and semantic map) to perform planning, which is computationally intensive and misses the instance-level structure information. In this paper, we propose VAD, an end-to-end vectorized paradigm for autonomous driving, which models the driving scene as a fully vectorized representation. The proposed vectorized paradigm has two significant advantages. On one hand, VAD exploits the vectorized agent motion and map elements as explicit instance-level planning constraints which effectively improves planning safety. On the other hand, VAD runs much faster than previous end-to-end planning methods by getting rid of computation-intensive rasterized representation and hand-designed post-processing steps. VAD achieves state-of-the-art end-to-end planning performance on the nuScenes dataset, outperforming the previous best method by a large margin. Our base model, VAD-Base, greatly reduces the average collision rate by 29.0% and runs 2.5x faster. Besides, a lightweight variant, VAD-Tiny, greatly improves the inference speed (up to 9.3x) while achieving comparable planning performance. We believe the excellent performance and the high efficiency of VAD are critical for the real-world deployment of an autonomous driving system. Code and models are available at https://github.com/hustvl/VAD for facilitating future research.

Vectorized Scene Representation for Autonomous Driving: An Analysis of VAD

The paper "VAD: Vectorized Scene Representation for Efficient Autonomous Driving" proposes an end-to-end paradigm for autonomous driving that leverages a fully vectorized scene representation. This approach addresses limitations of previous methods that rely heavily on computationally intensive rasterized scene representations, such as semantic and occupancy maps, which lack instance-level structural information.

Overview and Methodology

VAD distinguishes itself by representing both the map and traffic agent motion in a vectorized form, providing significant advantages in efficiency and planning safety. The methodology involves several key components:

Vectorized Scene Learning: The scene is represented through vectorized map elements and traffic participants' motion vectors. This approach not only reduces computational overhead but also enhances interpretability at the instance level.
Ego-Agent and Ego-Map Interaction: The ego-agent interaction captures dynamic scene attributes (other traffic participants), and the ego-map interaction captures static ones (map elements). Together they yield the scene features that planning depends on.
Planning: VAD plans the ego trajectory from these features, combining implicit scene information (the learned features) with explicit instance-level constraints derived from the vectorized agent motion and map elements. The explicit constraints are what the paper credits for improved planning safety.
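To make the idea of explicit instance-level constraints concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the hinge-style penalty, and the distance thresholds are all assumptions chosen for illustration. It assumes the planned ego trajectory and each predicted agent trajectory are lists of (x, y) waypoints at matching future timesteps, and that a map boundary is a polyline given as a list of (x, y) points.

```python
import math

def closest_distance(point, others):
    """Smallest Euclidean distance from `point` to any point in `others`."""
    return min(math.dist(point, other) for other in others)

def ego_agent_penalty(ego_plan, agent_trajs, safe_dist=3.0):
    """Mean hinge penalty for coming closer than `safe_dist` to any agent.

    ego_plan:    list of T (x, y) ego waypoints.
    agent_trajs: list of agent trajectories, each a list of T (x, y) waypoints.
    safe_dist:   hypothetical safety threshold in meters (an assumption).
    """
    penalties = []
    for t, ego_xy in enumerate(ego_plan):
        # Compare the ego only with agent positions at the same timestep.
        agents_at_t = [traj[t] for traj in agent_trajs]
        gap = closest_distance(ego_xy, agents_at_t)
        penalties.append(max(safe_dist - gap, 0.0))
    return sum(penalties) / len(penalties)

def ego_boundary_penalty(ego_plan, boundary, safe_dist=1.0):
    """Mean hinge penalty for waypoints closer than `safe_dist` to a boundary.

    boundary: polyline as a list of (x, y) points; distance is measured to the
    polyline's vertices only, which is a simplification.
    """
    penalties = [
        max(safe_dist - closest_distance(ego_xy, boundary), 0.0)
        for ego_xy in ego_plan
    ]
    return sum(penalties) / len(penalties)

# Toy scene: the ego drives straight ahead along the y axis.
ego_plan = [(0.0, 1.0), (0.0, 2.0), (0.0, 3.0)]
far_agent = [(5.0, 1.0), (5.0, 2.0), (5.0, 3.0)]      # stays 5 m away
cutting_agent = [(2.0, 1.0), (1.0, 2.0), (0.0, 3.0)]  # converges on the ego
boundary = [(0.5, float(y)) for y in range(5)]        # 0.5 m to the ego's right

agent_cost = ego_agent_penalty(ego_plan, [far_agent, cutting_agent])
boundary_cost = ego_boundary_penalty(ego_plan, boundary)
```

Because each agent and each map element is a separate instance, the penalty can be traced to the specific agent or boundary that caused it. A rasterized occupancy grid does not expose this directly.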

Numerical Results and Performance

VAD reports state-of-the-art end-to-end planning performance on the nuScenes dataset. Compared with the previous best method, the base model, VAD-Base, reduces the average collision rate by 29.0% and runs 2.5x faster. A lightweight variant, VAD-Tiny, runs up to 9.3x faster while achieving comparable planning performance.
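The reported figures are relative, so it helps to see what they imply against a baseline normalized to 1.0. The short sketch below only does that arithmetic and introduces no measurements. The helper names are hypothetical, and it assumes "N times faster" means an N-fold increase in throughput, so per-frame latency scales as 1/N.

```python
def remaining_fraction(relative_reduction):
    """Fraction of the baseline quantity left after a relative reduction."""
    return 1.0 - relative_reduction

def latency_fraction(speedup):
    """Per-frame latency relative to the baseline for a given speedup factor."""
    return 1.0 / speedup

# Figures quoted in the abstract, applied to a baseline normalized to 1.0.
collision_left = round(remaining_fraction(0.29), 3)  # 0.71 of baseline collision rate
base_latency = round(latency_fraction(2.5), 3)       # 0.4 of baseline latency
tiny_latency = round(latency_fraction(9.3), 3)       # 0.108 of baseline latency
```

Under that reading, VAD-Base keeps about 71% of the baseline collision rate at 40% of the latency, and VAD-Tiny needs roughly 11% of the baseline latency.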

These results support the paper's claim that a vectorized representation lowers computational cost while improving planning safety, both of which matter for real-time deployment.

Implications and Future Directions

Vectorized representations point towards more efficient and interpretable driving models. By exploiting instance-level information, VAD improves planning performance and lays a foundation for further research in this direction. Future work could explore how to use multi-modal motion predictions in planning and how to bring additional traffic information, such as lane graphs and road signals, into the vectorized framework.

Overall, the work sets a new direction for autonomous driving research, emphasizing the significance of transitioning to vectorized paradigms that offer both theoretical and practical benefits. This could lead to more robust, scalable, and deployable autonomous driving systems in the future.

Authors (10)

Bo Jiang
Shaoyu Chen
Qing Xu
Bencheng Liao
Jiajie Chen
Helong Zhou
Qian Zhang
Wenyu Liu
Chang Huang
Xinggang Wang

GitHub

GitHub - hustvl/VAD: [ICCV 2023] VAD: Vectorized Scene Representation for Efficient Autonomous Driving (613 stars)