M$3$CAD: A Benchmark for Cooperative Autonomous Driving Research
The paper introduces M$3$CAD, a benchmark designed to advance cooperative autonomous driving (CAD) research. With 204 sequences and 30,000 frames, M$3$CAD provides a rich dataset of multimodal sensory inputs, including LiDAR point clouds, RGB images, and GPS/IMU data. The dataset supports a diverse range of autonomous driving tasks, including object detection, tracking, mapping, motion forecasting, occupancy prediction, and path planning. The core value of M$3$CAD lies in its support for both single-vehicle and multi-vehicle research, providing a comprehensive framework for studying cooperation among autonomous vehicles.
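As a concrete illustration of what a multi-vehicle, multimodal sample might look like, the sketch below models one cooperative frame. All field names and types here are illustrative assumptions for exposition, not M$3$CAD's actual on-disk schema.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical schema for one frame of a cooperative driving sequence.
# Field names are illustrative; M$3$CAD's actual format may differ.
@dataclass
class AgentFrame:
    agent_id: str            # which cooperating vehicle produced this frame
    timestamp: float         # seconds since sequence start
    lidar_path: str          # path to the LiDAR point cloud file
    image_paths: List[str]   # one RGB image per camera
    gps_imu: List[float]     # e.g. [lat, lon, alt, roll, pitch, yaw]
    ego_pose: List[float]    # global pose used for cross-vehicle alignment

@dataclass
class CooperativeFrame:
    """All agents' observations at one timestep, plus a shared index."""
    frame_index: int
    agents: List[AgentFrame] = field(default_factory=list)

frame = CooperativeFrame(
    frame_index=0,
    agents=[AgentFrame("ego", 0.0, "lidar/0.bin", ["cam/0.png"],
                       [0.0] * 6, [0.0] * 6)],
)
```

Grouping per-agent records under a shared frame index mirrors how cooperative benchmarks align observations from multiple vehicles at the same simulation tick.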
Key Features of M$3$CAD
The paper emphasizes how M$3$CAD overcomes the limitations of existing benchmarks, which typically cater to single-vehicle scenarios. Unlike predecessors such as KITTI and nuScenes, M$3$CAD provides scenarios for vehicle collaboration across multiple tasks, drawing on diverse sensor inputs and varied environmental settings. This is achieved through the advanced rendering capabilities of Unreal Engine 5 integrated within CARLA. Consequently, M$3$CAD offers a level of realism and complexity that sets it apart in cooperative driving research.
Novel Contributions
- Comprehensiveness: M$3$CAD supports multifaceted collaboration among vehicles: it encompasses all standard single-vehicle tasks while also enabling research into cooperative motion forecasting, mapping, and path planning.
- Realistic Trajectories: Vehicles in M$3$CAD navigate complex trajectories through stochastic environments, in stark contrast to benchmarks that assume simplistic straight-line trajectories and thus fail to mimic real-world driving conditions.
- Flexible Experimental Setup: It offers researchers tools to tailor scenarios in CARLA based on specific needs, allowing experiments under varied weather conditions and traffic types.
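The flexible setup described above can be pictured as a small scenario specification. The sketch below is a hypothetical configuration object, not CARLA's actual API, though the parameter names loosely mirror common CARLA settings (map names, weather presets).

```python
from dataclasses import dataclass

# Hypothetical scenario description; the fields are illustrative
# assumptions, not part of M$3$CAD's or CARLA's actual interface.
@dataclass
class ScenarioConfig:
    town: str = "Town05"          # CARLA map name
    weather: str = "ClearNoon"    # weather preset
    num_cooperating_vehicles: int = 2
    num_background_vehicles: int = 30
    sequence_length: int = 150    # frames to record

# Vary weather and fleet size without touching the rest of the pipeline.
rainy_night = ScenarioConfig(weather="HardRainNight",
                             num_cooperating_vehicles=3)
```

Keeping the scenario as declarative data makes sweeps over weather conditions and traffic types a matter of enumerating configs rather than editing simulation code.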
Insights and Implications
M$3$CAD significantly expands the potential for deeper insight into CAD systems by fostering exploration of inter-vehicle collaboration. Its global ground truth facilitates robust model training, ultimately leading to safer and more efficient path planning. This, alongside effective fusion of multimodal sensory data, is essential for achieving realistic autonomous driving solutions. The accompanying E2EC framework additionally offers a modular way for researchers to investigate the performance of individual tasks within cooperative settings.
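To make the idea of multimodal fusion concrete, here is a minimal late-fusion sketch: per-modality feature vectors are concatenated and passed through a linear projection. The dimensions, random weights, and the simple concatenation scheme are assumptions for illustration; real cooperative pipelines (e.g. BEV-based fusion) are considerably more involved.

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse_features(lidar_feat, image_feat, pose_feat, w):
    """Late fusion by concatenation plus a learned linear projection.

    A minimal sketch of multimodal fusion, not the paper's method.
    """
    x = np.concatenate([lidar_feat, image_feat, pose_feat])  # (d1+d2+d3,)
    return w @ x                                             # fused embedding

# Toy per-modality features and a random "learned" projection matrix.
lidar = rng.standard_normal(64)
image = rng.standard_normal(128)
pose = rng.standard_normal(6)
w = rng.standard_normal((32, 64 + 128 + 6))

fused = fuse_features(lidar, image, pose, w)
print(fused.shape)  # (32,)
```

In practice the projection would be trained end to end, and fusion would also have to align features across vehicles using the shared global poses.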
In experiments, applying UniAD, a state-of-the-art end-to-end driving solution, to M$3$CAD shows marked benefits from cooperation: the cooperative framework reduces trajectory error and collision probability, underscoring the importance of vehicle collaboration for improved perception and planning.
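The two quantities mentioned above can be sketched as simple metrics: a mean L2 trajectory error over planned waypoints, and a collision indicator averaged over timesteps. This is a generic sketch, not the paper's exact evaluation protocol; the 2.0 m proximity radius in particular is an assumption.

```python
import numpy as np

def average_displacement_error(pred, gt):
    """Mean L2 distance between predicted and ground-truth waypoints.

    pred, gt: arrays of shape (T, 2) -- planned (x, y) positions over T steps.
    """
    return float(np.linalg.norm(pred - gt, axis=1).mean())

def collision_rate(ego_traj, obstacle_trajs, radius=2.0):
    """Fraction of timesteps where ego is within `radius` metres of any obstacle.

    ego_traj: (T, 2); obstacle_trajs: (N, T, 2). A simplified
    circle-overlap check; the radius is an illustrative assumption.
    """
    dists = np.linalg.norm(obstacle_trajs - ego_traj[None, :, :], axis=2)  # (N, T)
    return float((dists.min(axis=0) < radius).mean())

# A straight-line ground truth and a plan offset 0.5 m laterally.
gt = np.stack([np.arange(6, dtype=float), np.zeros(6)], axis=1)
pred = gt + np.array([0.0, 0.5])
print(average_displacement_error(pred, gt))  # 0.5
```

Lower values on both metrics are better, which is how the cooperative and single-vehicle variants can be compared on equal footing.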
Limitations and Future Work
The paper notes limitations in simulating large-scale vehicle scenarios owing to GPU memory constraints. Moreover, while extensive, the dataset currently focuses predominantly on vehicle interactions and does not comprehensively cover other traffic participants, such as pedestrians and cyclists. Addressing these gaps could drive further iterations of the benchmark and expand its applicability.
In conclusion, M$3$CAD's introduction marks a critical advance in CAD research. Its broad modality and task support provide fertile ground for modeling realistic driving scenarios, driving further exploration of cooperative frameworks, and bolstering work toward robust autonomous systems.