Enhanced Self-Checkout System for Retail Based on Improved YOLOv10 (2407.21308v2)

Published 31 Jul 2024 in cs.CV

Abstract: With the rapid advancement of deep learning technologies, computer vision has shown immense potential in retail automation. This paper presents a novel self-checkout system for retail based on an improved YOLOv10 network, aimed at enhancing checkout efficiency and reducing labor costs. We propose targeted optimizations to the YOLOv10 model by incorporating the detection head structure from YOLOv8, which significantly improves product recognition accuracy. Additionally, we develop a post-processing algorithm tailored for self-checkout scenarios to further enhance the system's applicability. Experimental results demonstrate that our system outperforms existing methods in both product recognition accuracy and checkout speed. This research not only provides a new technical solution for retail automation but also offers valuable insights into optimizing deep learning models for real-world applications.
