Improved YOLOv5 network for real-time multi-scale traffic sign detection (2112.08782v2)

Published 16 Dec 2021 in cs.CV and cs.LG

Abstract: Traffic sign detection is a challenging task for the unmanned driving system, especially for the detection of multi-scale targets and the real-time problem of detection. In the traffic sign detection process, the scale of the targets changes greatly, which will have a certain impact on the detection accuracy. Feature pyramid is widely used to solve this problem but it might break the feature consistency across different scales of traffic signs. Moreover, in practical application, it is difficult for common methods to improve the detection accuracy of multi-scale traffic signs while ensuring real-time detection. In this paper, we propose an improved feature pyramid model, named AF-FPN, which utilizes the adaptive attention module (AAM) and feature enhancement module (FEM) to reduce the information loss in the process of feature map generation and enhance the representation ability of the feature pyramid. We replaced the original feature pyramid network in YOLOv5 with AF-FPN, which improves the detection performance for multi-scale targets of the YOLOv5 network under the premise of ensuring real-time detection. Furthermore, a new automatic learning data augmentation method is proposed to enrich the dataset and improve the robustness of the model to make it more suitable for practical scenarios. Extensive experimental results on the Tsinghua-Tencent 100K (TT100K) dataset demonstrate the effectiveness and superiority of the proposed method when compared with several state-of-the-art methods.

Summary

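The paper presents an improved YOLOv5 network that replaces the original feature pyramid with AF-FPN, which combines an adaptive attention module (AAM) and a feature enhancement module (FEM), and adds an AutoAugment-inspired data augmentation strategy to improve multi-scale traffic sign detection.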
It achieves a mean average precision of 65.14% overall and 41.46% on small targets on the TT100K dataset, outperforming baseline models such as YOLOX.
The approach balances computational efficiency and detection accuracy, making it suitable for mobile platforms in autonomous driving systems.

Improved YOLOv5 Network for Real-Time Multi-Scale Traffic Sign Detection

The paper presents an enhancement to the YOLOv5 architecture aimed at detecting traffic signs of widely varying scales in real time on mobile platforms. The authors address two challenges inherent in traffic sign recognition for unmanned driving and intelligent transportation systems: large variation in target scale, and the need for real-time detection.

Key Contributions

The paper introduces several key innovations to the YOLOv5 network framework:

AF-FPN Integration: The adaptive attention module (AAM) and feature enhancement module (FEM) are introduced within an improved feature pyramid network (AF-FPN), replacing the original feature pyramid in YOLOv5. This modification seeks to minimize information loss during feature map generation and enhance multi-scale target detection accuracy without compromising detection speed.
Automatic Data Augmentation: A new data augmentation strategy inspired by AutoAugment is implemented, optimizing the training dataset to improve model robustness and adapt more effectively to real-world scenarios. The augmented dataset facilitates improved generalization and performance on multi-scale traffic signs.
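As a rough illustration of the AutoAugment-style idea, the sketch below applies a randomly chosen sub-policy to an image, where each sub-policy is a short sequence of (operation, probability, magnitude) triples. It is not the authors' method: the operations, the policy values, and the names `OPS`, `POLICY`, and `apply_policy` are all hypothetical, and the search that learns a good policy is omitted. Images are grayscale grids of floats in [0, 1] to keep the example dependency-free; geometric operations are left out because they would also require transforming the bounding boxes.

```python
import random


def _clip(value):
    """Keep a pixel value inside the valid [0, 1] range."""
    return max(0.0, min(1.0, value))


def brightness(img, magnitude):
    """Shift every pixel by `magnitude` (hypothetical photometric op)."""
    return [[_clip(p + magnitude) for p in row] for row in img]


def contrast(img, magnitude):
    """Scale each pixel's distance from the image mean by (1 + magnitude)."""
    flat = [p for row in img for p in row]
    mean = sum(flat) / len(flat)
    return [[_clip(mean + (p - mean) * (1.0 + magnitude)) for p in row]
            for row in img]


# Hypothetical operation table; a real system would use many more operations.
OPS = {"brightness": brightness, "contrast": contrast}

# Hypothetical policy: a list of sub-policies, each a sequence of
# (operation name, probability of applying it, magnitude).
POLICY = [
    [("brightness", 0.8, 0.2), ("contrast", 0.5, 0.5)],
    [("contrast", 0.6, -0.3), ("brightness", 0.4, -0.1)],
]


def apply_policy(img, policy, rng):
    """Pick one sub-policy at random and apply its operations in order."""
    sub_policy = rng.choice(policy)
    for name, prob, magnitude in sub_policy:
        if rng.random() < prob:
            img = OPS[name](img, magnitude)
    return img


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    image = [[0.1, 0.5], [0.7, 0.9]]
    print(apply_policy(image, POLICY, rng))
```

In an AutoAugment-style setup, the probabilities and magnitudes would be chosen by a search procedure that scores candidate policies on validation accuracy; here they are fixed by hand purely for illustration.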

Experimental Results

On the Tsinghua-Tencent 100K (TT100K) dataset, the improved YOLOv5 network outperformed existing methods including YOLOv5, EfficientDet, and YOLOX:

Detection Performance: The improved YOLOv5 network achieved a mean average precision (mAP) of 65.14% across traffic signs of all scales. On small targets, accuracy reached 41.46%, higher than that of the compared methods.
Computational Efficiency: With a model size of 16.3M and FLOPs of 17.9G, the improved network maintains a balance between computational resource requirements and detection efficacy, making it suitable for deployment on mobile platforms such as self-driving cars.
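For readers unfamiliar with the metric, the sketch below shows one common way to compute average precision (AP) for a single class and to average it into mAP, along with a size bucketing that follows the COCO convention (small: area below 32x32 pixels, medium: below 96x96, large: otherwise). This is an assumption-laden illustration: the paper's exact evaluation protocol, IoU threshold, and size thresholds may differ, and the function names are hypothetical.

```python
def average_precision(detections, num_gt):
    """All-point interpolated AP for one class.

    detections: list of (confidence, is_true_positive) pairs, where the
                true-positive flag was already decided by IoU matching.
    num_gt:     number of ground-truth boxes for this class.
    """
    if num_gt == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precisions, recalls = [], []
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_gt)
    # Replace each precision with the best precision at any higher recall,
    # so the curve is non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Area under the resulting step curve.
    ap, prev_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def mean_average_precision(per_class_ap):
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap) if per_class_ap else 0.0


def size_bucket(width, height):
    """COCO-style size bucket for a box (assumed thresholds, in pixels)."""
    area = width * height
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"


if __name__ == "__main__":
    dets = [(0.9, True), (0.8, False), (0.7, True)]
    print(average_precision(dets, num_gt=2))
    print(size_bucket(20, 20), size_bucket(50, 50), size_bucket(100, 100))
```

A small-target score such as the one quoted above would be obtained by restricting the ground-truth boxes to the "small" bucket before computing AP.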

Implications and Future Work

The proposed modifications to YOLOv5 optimize its suitability for real-time traffic sign detection systems, particularly in mobile scenarios with limited computational resources. These improvements bear substantial practical relevance in advancing autonomous vehicle technology and enhancing its operational accuracy and reliability in varied traffic conditions.
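To give a sense of where the model size and FLOPs figures quoted in the results come from, the sketch below counts the parameters and multiply-accumulate operations (MACs) of a single standard convolution layer. The layer dimensions are hypothetical and not taken from the paper, and the conversion of one MAC to two FLOPs is only one common convention; some tools report MACs directly as FLOPs, and the summary does not say which convention the paper uses.

```python
def conv2d_cost(c_in, c_out, kernel, h_out, w_out, bias=True):
    """Return (parameter count, MAC count) for one standard 2D convolution.

    c_in, c_out: input and output channel counts
    kernel:      side length of the square kernel
    h_out, w_out: spatial size of the output feature map
    """
    weights = kernel * kernel * c_in * c_out
    params = weights + (c_out if bias else 0)
    # Each output element needs one MAC per weight in its filter.
    macs = weights * h_out * w_out
    return params, macs


if __name__ == "__main__":
    # Hypothetical layer: 3x3 convolution, 64 -> 128 channels, 80x80 output.
    params, macs = conv2d_cost(64, 128, 3, 80, 80)
    print(f"parameters: {params:,}")
    print(f"MACs:       {macs:,}")
    print(f"FLOPs (2 per MAC, one convention): {2 * macs:,}")
```

Summing these counts over every layer gives whole-network totals, which is why adding modules to the feature pyramid raises both figures.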

Because the approach keeps computational demands low while improving detection performance, it leaves room for future work on challenges such as high-speed vehicle motion and the resulting image motion blur. Further optimization of the network structure could improve detection precision under such dynamic conditions.

Conclusion

The paper succeeds in advancing the capabilities of YOLOv5 for real-time traffic sign detection, underscoring key improvements in handling multi-scale targets with adaptive feature extraction and effective augmentation strategies. These contributions hold valuable implications for the development of robust, scalable traffic sign recognition systems integral to the next generation of intelligent driving applications.
