A Flying Bird Object Detection Method for Surveillance Video
Abstract: Flying bird objects in surveillance video have distinctive characteristics: their features are often not obvious in a single-frame image, most instances are small, and their shapes are asymmetric. To address these characteristics, this paper proposes a Flying Bird Object Detection method for Surveillance Video (FBOD-SV). Firstly, a new feature aggregation module, the Correlation Attention Feature Aggregation (Co-Attention-FA) module, is designed to aggregate the features of the flying bird object according to the object's correlation across multiple consecutive frames. Secondly, a Flying Bird Object Detection Network (FBOD-Net), with a down-sampling stage followed by an up-sampling stage, is designed; it detects these particular multi-scale (mostly small-scale) bird objects using a single large feature layer that fuses fine spatial information with large-receptive-field information. Finally, the SimOTA dynamic label allocation method is adapted to one-category object detection, and the resulting SimOTA-OC dynamic label allocation strategy is proposed to address the difficulty of assigning labels to irregularly shaped flying bird objects. The performance of FBOD-SV is validated on experimental datasets of flying bird objects collected from traction substation surveillance video. The experimental results show that FBOD-SV effectively improves the detection performance for flying bird objects in surveillance video.
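The core idea behind the Co-Attention-FA module, aggregating per-frame features weighted by their correlation with a reference frame, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; the choice of the middle frame as the reference, cosine similarity as the correlation measure, and a per-location softmax over time are all assumptions made for illustration.

```python
import numpy as np

def co_attention_aggregate(frames: np.ndarray) -> np.ndarray:
    """Illustrative correlation-weighted feature aggregation.

    frames: (T, C, H, W) feature maps from T consecutive frames.
    Returns a single (C, H, W) aggregated feature map.
    """
    T, C, H, W = frames.shape
    key = frames[T // 2]                                    # reference frame (assumed)
    eps = 1e-8
    # Cosine similarity with the reference frame, computed
    # independently at every spatial location.
    norms = np.linalg.norm(frames, axis=1) + eps            # (T, H, W)
    key_norm = np.linalg.norm(key, axis=0) + eps            # (H, W)
    corr = (frames * key).sum(axis=1) / (norms * key_norm)  # (T, H, W)
    # Softmax over the time axis turns correlations into attention weights,
    # so frames that agree with the reference contribute more.
    w = np.exp(corr - corr.max(axis=0, keepdims=True))
    w /= w.sum(axis=0, keepdims=True)                       # (T, H, W)
    # Weighted sum over frames yields one aggregated feature map.
    return (w[:, None] * frames).sum(axis=0)                # (C, H, W)
```

When all frames are identical the weights are uniform and the output equals the input feature map, which is the expected degenerate case for a static scene.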
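The SimOTA-OC idea, dynamic label allocation specialized to a single object category, can likewise be sketched in simplified form. The cost weighting (3.0 on the IoU term), the top-q window for estimating the dynamic k, and the conflict-resolution rule below are assumptions following the general SimOTA recipe, not the paper's exact formulation.

```python
import numpy as np

def simota_oc_assign(ious: np.ndarray, obj_cost: np.ndarray, q: int = 10) -> np.ndarray:
    """Simplified one-category dynamic label assignment (SimOTA-style).

    ious:     (G, P) IoU between G ground-truth objects and P predictions.
    obj_cost: (G, P) per-pair confidence/objectness loss.
    Returns a boolean (G, P) matrix marking positive assignments.
    """
    G, P = ious.shape
    cost = obj_cost + 3.0 * (1.0 - ious)        # lower cost = better match (assumed weighting)
    assign = np.zeros((G, P), dtype=bool)
    for g in range(G):
        # Dynamic k: estimate how many positives this object deserves
        # from the total quality of its q best-overlapping predictions.
        topq = np.sort(ious[g])[::-1][:q]
        k = max(1, int(topq.sum()))
        pos = np.argsort(cost[g])[:k]           # k lowest-cost predictions
        assign[g, pos] = True
    # A prediction matched to several objects keeps only its cheapest match.
    for p in np.where(assign.sum(axis=0) > 1)[0]:
        best = np.argmin(np.where(assign[:, p], cost[:, p], np.inf))
        assign[:, p] = False
        assign[best, p] = True
    return assign
```

Because there is only one category, the classification term of the usual SimOTA cost collapses to a confidence term, which is what makes the one-category variant simpler than the multi-class original.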