Real-time Traffic Object Detection for Autonomous Driving (2402.00128v2)
Abstract: With recent advances in computer vision, it appears that autonomous driving will be part of modern society sooner rather than later. However, there are still a significant number of concerns to address. Although modern computer vision techniques demonstrate superior performance, they tend to prioritize accuracy over efficiency, which is a crucial aspect of real-time applications. Large object detection models typically require higher computational power, which is achieved by using more sophisticated onboard hardware. For autonomous driving, these requirements translate to increased fuel costs and, ultimately, a reduction in mileage. Further, despite their computational demands, the existing object detectors are far from being real-time. In this research, we assess the robustness of our previously proposed, highly efficient pedestrian detector LSFM on well-established autonomous driving benchmarks, including diverse weather conditions and nighttime scenes. Moreover, we extend our LSFM model for general object detection to achieve real-time object detection in traffic scenes. We evaluate its performance, low latency, and generalizability on traffic object detection datasets. Furthermore, we discuss the inadequacy of the current key performance indicator employed by object detection systems in the context of autonomous driving and propose a more suitable alternative that incorporates real-time requirements.
- “Rich feature hierarchies for accurate object detection and semantic segmentation,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2014, pp. 580–587.
- “Imagenet: A large-scale hierarchical image database,” in 2009 IEEE conference on computer vision and pattern recognition. Ieee, 2009, pp. 248–255.
- “Cascade r-cnn: Delving into high quality object detection,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 6154–6162.
- “You only look once: Unified, real-time object detection,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 779–788.
- “Yolov3: An incremental improvement,” arXiv preprint arXiv:1804.02767, 2018.
- “An image is worth 16x16 words: Transformers for image recognition at scale,” arXiv preprint arXiv:2010.11929, 2020.
- “Dynamic head: Unifying object detection heads with attentions,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2021, pp. 7373–7382.
- “End-to-end object detection with transformers,” in European conference on computer vision. Springer, 2020, pp. 213–229.
- “Swin transformer: Hierarchical vision transformer using shifted windows,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 10012–10022.
- “Fast convergence of detr with spatially modulated co-attention,” in Proceedings of the IEEE/CVF international conference on computer vision, 2021, pp. 3621–3630.
- “Cornernet: Detecting objects as paired keypoints,” in Proceedings of the European conference on computer vision (ECCV), 2018, pp. 734–750.
- “Centernet: Keypoint triplets for object detection,” in Proceedings of the IEEE/CVF international conference on computer vision, 2019, pp. 6569–6578.
- “Fcos: Fully convolutional one-stage object detection,” in Proceedings of the IEEE/CVF international conference on computer vision, 2019, pp. 9627–9636.
- “F2dnet: Fast focal detection network for pedestrian detection,” in 2022 26th International Conference on Pattern Recognition (ICPR). IEEE, 2022, pp. 4658–4664.
- “Localized semantic feature mixers for efficient pedestrian detection in autonomous driving,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 5476–5485.
- “Vision meets robotics: The kitti dataset,” International Journal of Robotics Research (IJRR), 2013.
- “Faster r-cnn: Towards real-time object detection with region proposal networks,” Advances in neural information processing systems, vol. 28, 2015.
- “Ssd: Single shot multibox detector,” in European conference on computer vision. Springer, 2016, pp. 21–37.
- “Focal loss for dense object detection,” in Proceedings of the IEEE international conference on computer vision, 2017, pp. 2980–2988.
- “Monocular pedestrian orientation estimation based on deep 2d-3d feedforward,” Pattern Recognition, vol. 100, pp. 107182, 2020.
- “High-level semantic networks for multi-scale object detection,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 30, no. 10, pp. 3372–3386, 2019.
- “Enhanced object detection with deep convolutional neural networks for advanced driving assistance,” IEEE transactions on intelligent transportation systems, vol. 21, no. 4, pp. 1572–1583, 2019.
- “Eurocity persons: A novel benchmark for person detection in traffic scenes,” IEEE transactions on pattern analysis and machine intelligence, vol. 41, no. 8, pp. 1844–1861, 2019.
- “Accurate single stage detector using recurrent rolling convolution,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 5420–5428.
- “Exploit all the layers: Fast and accurate cnn object detector with scale dependent pooling and cascaded rejection classifiers,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 2129–2137.
- “Sp-nas: Serial-to-parallel backbone search for object detection,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2020, pp. 11863–11872.
- “Generalizable pedestrian detection: The elephant in the room,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 11328–11337.
- “nuscenes: A multimodal dataset for autonomous driving,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2020, pp. 11621–11631.
- “Bdd100k: A diverse driving dataset for heterogeneous multitask learning,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2020, pp. 2636–2645.
- “Shift: a synthetic driving dataset for continuous multi-task domain adaptation,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 21371–21382.
- “Tju-dhd: A diverse high-resolution dataset for object detection,” IEEE Transactions on Image Processing, vol. 30, pp. 207–219, 2020.
- “Pedestrian detection: An evaluation of the state of the art,” IEEE transactions on pattern analysis and machine intelligence, vol. 34, no. 4, pp. 743–761, 2011.