Better YOLO with Attention-Augmented Network and Enhanced Generalization Performance for Safety Helmet Detection (2405.02591v1)
Abstract: Safety helmets play a crucial role in protecting workers from head injuries on construction sites, where hazards are prevalent. However, no existing approach simultaneously achieves high detection accuracy and efficient inference in complex environments. In this study, we use a YOLO-based model for safety helmet detection and achieve a 2% improvement in mAP (mean Average Precision) while reducing the parameter count and FLOPs by over 25%. YOLO (You Only Look Once) is a widely used, high-performance, lightweight model architecture that is well suited to complex environments. We present a novel approach that incorporates a lightweight feature-extraction backbone based on GhostNetv2, integrates attention modules such as the Spatial Channel-wise Attention Net (SCNet) and the Coordination Attention Net (CANet), and adopts the Gradient Norm Aware Minimization (GAM) optimizer for improved generalization. In safety-critical environments, accurate and fast detection of safety helmets plays a pivotal role in preventing occupational hazards and ensuring compliance with safety protocols. This work addresses the pressing need for robust and efficient helmet detection methods, offering a framework that both enhances accuracy and improves the adaptability of detection models to real-world conditions. Our experimental results underscore the synergistic effects of GhostNetv2, the attention modules, and the GAM optimizer, yielding a safety helmet detection method that improves accuracy, generalization, and efficiency.
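To make the efficiency claim concrete, the following minimal sketch counts the weights of a standard convolution against those of a Ghost module, the building block that the GhostNet family is built on. It is an illustration under stated assumptions, not the authors' implementation: the function names, the layer sizes (128 input channels, 256 output channels, 3×3 kernels), and the ratio of 2 are hypothetical choices, biases are ignored, and the long-range attention branch that GhostNetv2 adds is not modeled.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_params(c_in, c_out, k, ratio=2, cheap_k=3):
    """Weights in a Ghost module (bias ignored).

    A primary k x k convolution produces c_out // ratio "intrinsic" channels;
    each intrinsic channel is then expanded by (ratio - 1) depthwise
    cheap_k x cheap_k filters to generate the remaining "ghost" channels.
    Assumes c_out is divisible by ratio (an assumption of this sketch).
    """
    intrinsic = c_out // ratio
    primary = c_in * intrinsic * k * k
    cheap = intrinsic * (ratio - 1) * cheap_k * cheap_k
    return primary + cheap


# Hypothetical layer sizes, chosen only for illustration.
standard = conv_params(128, 256, 3)
ghost = ghost_params(128, 256, 3)
reduction = 1 - ghost / standard
print(f"standard: {standard}, ghost: {ghost}, reduction: {reduction:.1%}")
```

The per-layer saving computed here is larger than the whole-model figure reported in the abstract, which is expected: only part of a detector is replaced by Ghost-style blocks, and the attention modules add parameters of their own.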