CORP: A Multi-Modal Dataset for Campus-Oriented Roadside Perception Tasks (2404.03191v2)

Published 4 Apr 2024 in cs.CV

Abstract: Numerous roadside perception datasets have been introduced to propel research and development in autonomous driving and intelligent transportation systems. However, most of them concentrate on urban arterial roads, inadvertently overlooking residential areas such as parks and campuses, which exhibit entirely distinct characteristics. In light of this gap, we propose CORP, the first public benchmark dataset tailored for multi-modal roadside perception tasks in campus scenarios. Collected on a university campus, CORP consists of over 205k images and 102k point clouds captured from 18 cameras and 9 LiDAR sensors. These sensors, in varied configurations, are mounted on roadside utility poles to provide diverse viewpoints across the campus. The annotations in CORP encompass multi-dimensional information beyond 2D and 3D bounding boxes: unique instance IDs support seamless 3D tracking, and pixel masks support instance segmentation, enhancing the understanding of objects and their behaviors across the campus premises. Unlike other roadside datasets focused on urban traffic, CORP extends the spectrum to highlight the challenges of multi-modal perception on campuses and in other residential areas.
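To make the annotation dimensions described above concrete, the sketch below models one synchronized roadside frame and its labels as plain Python dataclasses. This is an illustrative assumption only: the abstract does not specify CORP's file layout, field names, or mask encoding, so every class, field, and value here (e.g. `mask_rle`, `box_2d`) is hypothetical rather than the dataset's actual schema.

```python
# Hypothetical data model for a CORP-style frame; all names and encodings are
# assumptions, not the published annotation format.
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Box3D:
    # Center (x, y, z) in the sensor/world frame, dimensions (l, w, h), yaw in radians.
    center: Tuple[float, float, float]
    size: Tuple[float, float, float]
    yaw: float


@dataclass
class Annotation:
    # One labeled object in one frame, combining the annotation types the
    # abstract mentions: 2D/3D boxes, a persistent track ID, and a pixel mask.
    track_id: int                                  # unique ID used for seamless 3D tracking
    category: str                                  # e.g. "pedestrian", "cyclist", "vehicle"
    box_2d: Optional[Tuple[int, int, int, int]]    # (x_min, y_min, x_max, y_max) in pixels
    box_3d: Optional[Box3D]                        # 3D bounding box
    mask_rle: Optional[str]                        # run-length-encoded instance mask (assumed encoding)


@dataclass
class Frame:
    # One synchronized capture from one roadside utility pole.
    timestamp: float
    camera_id: str          # one of the 18 cameras
    lidar_id: str           # one of the 9 LiDAR sensors
    image_path: str
    pointcloud_path: str
    annotations: List[Annotation] = field(default_factory=list)
```

A loader built on such a schema could, for example, group `Frame` objects by `track_id` to recover an object's trajectory across multiple poles, which is the kind of cross-view, campus-wide behavior understanding the dataset is designed to support.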
