
From Blurry to Brilliant Detection: YOLOv5-Based Aerial Object Detection with Super Resolution (2401.14661v1)

Published 26 Jan 2024 in cs.CV and cs.LG

Abstract: The demand for accurate object detection in aerial imagery has surged with the widespread use of drones and satellite technology. Traditional object detection models, trained on datasets biased towards large objects, struggle to perform optimally in aerial scenarios where small, densely clustered objects are prevalent. To address this challenge, we present an approach that combines super-resolution with an adapted lightweight YOLOv5 architecture. We evaluate the model on a range of datasets, including VisDrone-2023, SeaDronesSee, VEDAI, and NWPU VHR-10. Our Super Resolved YOLOv5 architecture features Transformer encoder blocks, which allow the model to capture global context information and lead to improved detection results, especially in high-density, occluded conditions. This lightweight model not only delivers improved accuracy but also uses resources efficiently, making it well suited for real-time applications. Our experimental results demonstrate the model's superior performance in detecting small and densely clustered objects, underlining the significance of dataset choice and architectural adaptation for this specific task. In particular, the method achieves 52.5% mAP on VisDrone, exceeding the best prior results. This approach promises to significantly advance object detection in aerial imagery, contributing to more accurate and reliable results in a variety of real-world applications.
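
For readers unfamiliar with the mAP figure quoted in the abstract, the following is a minimal sketch of how average precision (AP) can be computed for a single class; mAP is the mean of AP over classes. This is not the authors' evaluation code. The function names, the greedy matching rule, and the IoU threshold of 0.5 are assumptions made for illustration, since the abstract does not state which mAP variant is reported.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truths, iou_threshold=0.5):
    """AP for one class on one image (illustrative, simplified).

    detections: list of (score, box); ground_truths: list of boxes.
    iou_threshold=0.5 is an assumption, not taken from the paper.
    """
    if not ground_truths:
        return 0.0
    matched = [False] * len(ground_truths)
    tp_flags = []
    # Visit detections from most to least confident.
    for _score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truths):
            if matched[idx]:
                continue  # each ground-truth box may be matched once
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx >= 0 and best_iou >= iou_threshold:
            matched[best_idx] = True
            tp_flags.append(True)
        else:
            tp_flags.append(False)

    # Precision and recall after each ranked detection.
    precisions, recalls, tp = [], [], 0
    for rank, is_tp in enumerate(tp_flags, start=1):
        tp += int(is_tp)
        precisions.append(tp / rank)
        recalls.append(tp / len(ground_truths))

    # Make precision non-increasing in recall (the usual envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

With two ground-truth boxes and three detections, of which the first and third are correct, the ranked precision values are 1, 1/2, and 2/3, and the resulting AP is 5/6. Benchmark toolkits typically pool detections across all images and may average over several IoU thresholds, so their numbers are not directly comparable to this sketch.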

Authors (4)
  1. Ragib Amin Nihal (5 papers)
  2. Benjamin Yen (6 papers)
  3. Katsutoshi Itoyama (12 papers)
  4. Kazuhiro Nakadai (20 papers)