
TUMTraf V2X Cooperative Perception Dataset (2403.01316v1)

Published 2 Mar 2024 in cs.CV

Abstract: Cooperative perception offers several benefits for enhancing the capabilities of autonomous vehicles and improving road safety. Using roadside sensors in addition to onboard sensors increases reliability and extends the sensor range. External sensors offer higher situational awareness for automated vehicles and prevent occlusions. We propose CoopDet3D, a cooperative multi-modal fusion model, and TUMTraf-V2X, a perception dataset, for the cooperative 3D object detection and tracking task. Our dataset contains 2,000 labeled point clouds and 5,000 labeled images from five roadside and four onboard sensors. It includes 30k 3D boxes with track IDs and precise GPS and IMU data. We labeled eight categories and covered occlusion scenarios with challenging driving maneuvers, like traffic violations, near-miss events, overtaking, and U-turns. Through multiple experiments, we show that our CoopDet3D camera-LiDAR fusion model achieves an increase of +14.36 3D mAP compared to a vehicle camera-LiDAR fusion model. Finally, we make our dataset, model, labeling tool, and dev-kit publicly available on our website: https://tum-traffic-dataset.github.io/tumtraf-v2x.
