Q-YOLOP: Quantization-aware You Only Look Once for Panoptic Driving Perception (2307.04537v1)
Abstract: In this work, we present an efficient, quantization-aware panoptic driving perception model (Q-YOLOP) for object detection, drivable area segmentation, and lane line segmentation in the context of autonomous driving. Our model employs the Efficient Layer Aggregation Network (ELAN) as its backbone and task-specific heads for each task. We employ a four-stage training process that includes pretraining on the BDD100K dataset, finetuning on both the BDD100K and iVS datasets, and quantization-aware training (QAT) on BDD100K. Throughout training, we use strong data augmentation techniques, such as random perspective and mosaic, and train the model on the combined BDD100K and iVS data. Both strategies enhance the model's generalization capabilities. The proposed model achieves state-of-the-art performance with an mAP@0.5 of 0.622 for object detection and an mIoU of 0.612 for segmentation, while maintaining low computational and memory requirements.
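To make the quantization-aware training (QAT) step concrete, the sketch below shows how fake-quantization can be inserted into a small multi-task network with PyTorch's eager-mode QAT API, then finetuned and converted to int8. This is a minimal illustration under assumed details, not the authors' actual pipeline: the model `TinyPerceptionNet`, its layer sizes, and the placeholder loss are hypothetical stand-ins for the ELAN backbone and the detection/segmentation heads described in the abstract.

```python
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qat_qconfig, prepare_qat, convert

# Hypothetical stand-in for the ELAN-based multi-task model; not the paper's architecture.
class TinyPerceptionNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.ao.quantization.QuantStub()      # entry point for int8 inference
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.det_head = nn.Conv2d(32, 18, 1)   # toy detection head
        self.seg_head = nn.Conv2d(32, 3, 1)    # toy segmentation head (drivable / lane / background)
        self.dequant = torch.ao.quantization.DeQuantStub()  # back to float for the losses

    def forward(self, x):
        x = self.quant(x)
        feats = self.backbone(x)
        return self.dequant(self.det_head(feats)), self.dequant(self.seg_head(feats))

model = TinyPerceptionNet().train()
model.qconfig = get_default_qat_qconfig("fbgemm")  # weight/activation fake-quant observers
prepare_qat(model, inplace=True)                   # inserts fake-quantization modules

# Stand-in QAT finetuning loop (the paper finetunes on BDD100K at this stage).
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
for _ in range(2):
    x = torch.randn(2, 3, 64, 64)                  # dummy batch in place of real images
    det, seg = model(x)
    loss = det.abs().mean() + seg.abs().mean()     # placeholder for detection + segmentation losses
    opt.zero_grad(); loss.backward(); opt.step()

model.eval()
int8_model = convert(model)                        # folds observers into true int8 modules
```

Because the fake-quantization ops are present during finetuning, the network learns weights that remain accurate once rounded to int8, which is what lets the deployed model keep low computational and memory requirements without a large accuracy drop.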