
Illicit object detection in X-ray images using Vision Transformers (2403.19043v2)

Published 27 Mar 2024 in cs.CV

Abstract: Illicit object detection is a critical task performed at various high-security locations, including airports, train stations, subways, and ports. The continuous and tedious work of examining thousands of X-ray images per hour can be mentally taxing. Thus, Deep Neural Networks (DNNs) can be used to automate the X-ray image analysis process, improve efficiency and alleviate the security officers' inspection burden. The neural architectures typically utilized in relevant literature are Convolutional Neural Networks (CNNs), with Vision Transformers (ViTs) rarely employed. In order to address this gap, this paper conducts a comprehensive evaluation of relevant ViT architectures on illicit item detection in X-ray images. This study utilizes both Transformer and hybrid backbones, such as SWIN and NextViT, and detectors, such as DINO and RT-DETR. The results demonstrate the remarkable accuracy of the DINO Transformer detector in the low-data regime, the impressive real-time performance of YOLOv8, and the effectiveness of the hybrid NextViT backbone.

