Revolutionizing Traffic Sign Recognition: Unveiling the Potential of Vision Transformers (2404.19066v1)

Published 29 Apr 2024 in cs.CV

Abstract: This research introduces a method for Traffic Sign Recognition (TSR) based on deep learning, with a particular emphasis on Vision Transformers. TSR plays a vital role in driver assistance systems and autonomous vehicles. Traditional TSR approaches that rely on manual feature extraction are labor-intensive and costly, and methods based on shape and color have inherent limitations, including sensitivity to environmental factors such as changes in lighting conditions. This study evaluates three Vision Transformer variants (PVT, TNT, LNL) and six convolutional neural networks (AlexNet, ResNet, VGG16, MobileNet, EfficientNet, GoogLeNet) as baseline models. To address the shortcomings of traditional methods, a novel pyramid EATFormer backbone is proposed that combines Evolutionary Algorithms (EAs) with the Transformer architecture. The EA-based Transformer block captures multi-scale, interactive, and individual information through three components: a Multi-Scale Region Aggregation module, a Global and Local Interaction module, and a Feed-Forward Network. A Modulated Deformable MSA (multi-head self-attention) module is further introduced to model irregular spatial locations dynamically. Experiments on the GTSRB and BelgiumTS datasets show that the proposed approach improves both prediction speed and accuracy. The study concludes that Vision Transformers hold significant promise for traffic sign classification and contributes a new algorithmic framework for TSR. These findings lay the groundwork for precise and dependable TSR algorithms that benefit driver assistance systems and autonomous vehicles.
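The abstract describes the EA-based Transformer block only at a high level. Below is a minimal PyTorch sketch of how such a block could be wired together, assuming the component roles named above: multi-scale aggregation via parallel depthwise convolutions, a global-attention path fused with a local convolutional path, and a pointwise feed-forward network. All module names, shapes, and hyperparameters here are illustrative assumptions rather than the authors' implementation, and the Modulated Deformable MSA module is omitted for brevity.

    # Illustrative sketch only -- module names and hyperparameters are
    # assumptions based on the abstract, not the authors' code.
    import torch
    import torch.nn as nn


    class MultiScaleRegionAggregation(nn.Module):
        """Aggregates local context at several scales via parallel depthwise convs."""

        def __init__(self, dim, scales=(1, 3, 5)):
            super().__init__()
            self.branches = nn.ModuleList(
                nn.Conv2d(dim, dim, k, padding=k // 2, groups=dim) for k in scales
            )
            self.fuse = nn.Conv2d(dim * len(scales), dim, kernel_size=1)

        def forward(self, x):  # x: (B, C, H, W)
            return self.fuse(torch.cat([branch(x) for branch in self.branches], dim=1))


    class GlobalLocalInteraction(nn.Module):
        """Runs global self-attention and a local depthwise-conv path in parallel."""

        def __init__(self, dim, num_heads=4):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            self.local = nn.Conv2d(dim, dim, kernel_size=3, padding=1, groups=dim)

        def forward(self, x):  # x: (B, C, H, W)
            b, c, h, w = x.shape
            tokens = x.flatten(2).transpose(1, 2)      # (B, H*W, C)
            global_out, _ = self.attn(tokens, tokens, tokens)
            global_out = global_out.transpose(1, 2).reshape(b, c, h, w)
            return global_out + self.local(x)          # fuse global and local paths


    class EATransformerBlock(nn.Module):
        """MSRA -> GLI -> FFN, each wrapped in a residual connection."""

        def __init__(self, dim, num_heads=4, mlp_ratio=4):
            super().__init__()
            self.msra = MultiScaleRegionAggregation(dim)
            self.norm1 = nn.GroupNorm(1, dim)  # channel-wise LayerNorm equivalent
            self.gli = GlobalLocalInteraction(dim, num_heads)
            self.norm2 = nn.GroupNorm(1, dim)
            self.ffn = nn.Sequential(          # pointwise feed-forward network
                nn.Conv2d(dim, dim * mlp_ratio, kernel_size=1),
                nn.GELU(),
                nn.Conv2d(dim * mlp_ratio, dim, kernel_size=1),
            )

        def forward(self, x):
            x = x + self.msra(x)
            x = x + self.gli(self.norm1(x))
            return x + self.ffn(self.norm2(x))


    if __name__ == "__main__":
        block = EATransformerBlock(dim=64)
        out = block(torch.randn(2, 64, 32, 32))
        print(out.shape)  # torch.Size([2, 64, 32, 32])

In a pyramid backbone of the kind the abstract describes, blocks like this would typically be stacked in stages, with spatial downsampling between stages; that staging detail is likewise an assumption here.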
