
Lightweight Object Detection: A Study Based on YOLOv7 Integrated with ShuffleNetv2 and Vision Transformer (2403.01736v1)

Published 4 Mar 2024 in cs.CV

Abstract: As mobile computing technology evolves, deploying efficient object detection algorithms on mobile devices has become an important research area in computer vision. This study optimizes the YOLOv7 algorithm to improve its efficiency and speed on mobile platforms while maintaining high accuracy. By combining Group Convolution, ShuffleNetV2, and Vision Transformer, the work reduces the model's parameter count and memory usage, simplifies the network architecture, and strengthens real-time object detection on resource-constrained devices. The experimental results show that the refined YOLO model markedly increases processing speed while sustaining high detection accuracy.
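
The abstract credits Group Convolution and ShuffleNetV2 for the reduced parameter count. The following is a minimal sketch of why those two ideas save parameters, not the paper's implementation: `conv_params` and `channel_shuffle` are hypothetical helper names, the layer sizes are illustrative assumptions, and bias terms are ignored.

```python
def conv_params(c_in: int, c_out: int, kernel: int, groups: int = 1) -> int:
    """Weight count of a 2D convolution layer (bias ignored).

    With `groups` > 1, each output channel only sees c_in / groups
    input channels, so the weight count shrinks by a factor of `groups`.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * c_out * kernel * kernel


def channel_shuffle(channels: list, groups: int) -> list:
    """Interleave channels across groups, as in ShuffleNet-style blocks.

    Equivalent to reshaping to [groups, per_group], transposing, and
    flattening, so the next grouped convolution mixes information
    from every group.
    """
    if len(channels) % groups:
        raise ValueError("number of channels must be divisible by groups")
    per_group = len(channels) // groups
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


if __name__ == "__main__":
    # Illustrative sizes only (assumed, not taken from the paper).
    standard = conv_params(128, 128, kernel=3)
    grouped = conv_params(128, 128, kernel=3, groups=4)
    print(standard, grouped, standard // grouped)
    print(channel_shuffle(list(range(6)), groups=2))
```

Under these assumed sizes, a grouped 3x3 convolution with 4 groups needs a quarter of the weights of the standard layer; the shuffle step costs no parameters and only reorders channels.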

Authors (1)
  1. Wenkai Gong (1 paper)
