
Q-YOLOP: Quantization-aware You Only Look Once for Panoptic Driving Perception (2307.04537v1)

Published 10 Jul 2023 in cs.CV and cs.AI

Abstract: In this work, we present an efficient and quantization-aware panoptic driving perception model (Q-YOLOP) for object detection, drivable area segmentation, and lane line segmentation, in the context of autonomous driving. Our model employs the Efficient Layer Aggregation Network (ELAN) as its backbone and task-specific heads for each task. We employ a four-stage training process that includes pretraining on the BDD100K dataset, finetuning on both the BDD100K and iVS datasets, and quantization-aware training (QAT) on BDD100K. During the training process, we use powerful data augmentation techniques, such as random perspective and mosaic, and train the model on a combination of the BDD100K and iVS datasets. Both strategies enhance the model's generalization capabilities. The proposed model achieves state-of-the-art performance with an mAP@0.5 of 0.622 for object detection and an mIoU of 0.612 for segmentation, while maintaining low computational and memory requirements.
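The quantization-aware training stage the abstract describes can be illustrated with PyTorch's eager-mode QAT workflow (the paper's implementation is built on PyTorch, per its references). The sketch below is an assumption-laden illustration, not the authors' code: the tiny conv module, random data, and loss stand in for the actual ELAN backbone, multi-task heads, and BDD100K finetuning.

```python
import torch
import torch.nn as nn
from torch.ao import quantization as tq

# Hypothetical stand-in for the Q-YOLOP network: the real model uses an
# ELAN backbone plus detection/segmentation heads, not reproduced here.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()      # fp32 -> int8 boundary
        self.conv = nn.Conv2d(3, 16, 3, padding=1)
        self.relu = nn.ReLU()
        self.dequant = tq.DeQuantStub()  # int8 -> fp32 boundary

    def forward(self, x):
        return self.dequant(self.relu(self.conv(self.quant(x))))

model = TinyNet().train()

# Attach fake-quantization observers so finetuning sees int8 rounding
# noise -- this is the "quantization-aware" part of the final stage.
model.qconfig = tq.get_default_qat_qconfig("fbgemm")
tq.prepare_qat(model, inplace=True)

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
for _ in range(10):                      # placeholder for QAT on BDD100K
    x = torch.randn(4, 3, 64, 64)        # random images, not real data
    loss = model(x).abs().mean()         # placeholder loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Fold the learned observer ranges into true int8 kernels for deployment.
model.eval()
int8_model = tq.convert(model)
```

In the paper's four-stage schedule this QAT pass comes last, after float pretraining and finetuning, so the int8 model starts from already-converged float weights rather than learning under quantization noise from scratch.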
