Common Corruptions for Enhancing and Evaluating Robustness in Air-to-Air Visual Object Detection (2405.06765v2)
Abstract: The main barrier to achieving fully autonomous flight lies in autonomous aircraft navigation, and managing non-cooperative traffic is the most important challenge in this problem. The most efficient strategy for handling non-cooperative traffic is based on monocular video processing through deep learning models. This study contributes to the vision-based deep learning literature on aircraft detection and tracking by investigating how data corruption arising from environmental and hardware conditions affects the effectiveness of these methods. More specifically, we designed seven types of common corruptions for camera inputs, taking real-world flight conditions into account. By applying these corruptions to the Airborne Object Tracking (AOT) dataset, we constructed AOT-C, the first robustness benchmark dataset for air-to-air aerial object detection. The corruptions in this dataset cover a wide range of challenging conditions, such as adverse weather and sensor noise. The second main contribution of this letter is an extensive experimental evaluation of eight diverse object detectors that explores the degradation in performance under escalating levels of corruption (domain shift). Based on the evaluation results, the key observations are the following: 1) one-stage detectors of the YOLO family demonstrate better robustness; 2) transformer-based and multi-stage detectors such as Faster R-CNN are extremely vulnerable to corruptions; 3) robustness against corruptions is related to the generalization ability of models. The third main contribution is to show that fine-tuning on our corruption-augmented synthetic data improves the generalization ability of the object detector in real-world flight experiments.
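To make the corruption pipeline concrete, below is a minimal sketch, in the spirit of the common-corruptions protocol of Hendrycks and Dietterich, of two severity-parameterized corruptions (sensor noise and fog) applied to a camera frame. The function names, the five-step severity schedules, and the noise/haze magnitudes are illustrative assumptions, not the exact implementation used to build AOT-C.

```python
import numpy as np

def gaussian_noise(image: np.ndarray, severity: int = 1) -> np.ndarray:
    # Additive Gaussian sensor noise; sigma grows with severity (1-5).
    # The sigma schedule is an illustrative assumption, not the AOT-C values.
    sigma = [0.04, 0.06, 0.08, 0.09, 0.10][severity - 1]
    x = image.astype(np.float32) / 255.0
    x = x + np.random.normal(scale=sigma, size=x.shape)
    return (np.clip(x, 0.0, 1.0) * 255.0).astype(np.uint8)

def fog(image: np.ndarray, severity: int = 1) -> np.ndarray:
    # Homogeneous-haze approximation: blend the frame toward white, with
    # the blend weight escalating across the five severity levels.
    t = [0.15, 0.30, 0.45, 0.60, 0.75][severity - 1]  # illustrative strengths
    return ((1.0 - t) * image.astype(np.float32) + t * 255.0).astype(np.uint8)

# Generate a corrupted copy of a (dummy) frame at every severity level.
# Bounding-box annotations carry over unchanged: the pixels are perturbed,
# but the objects do not move.
frame = np.random.randint(0, 256, (512, 512, 3), dtype=np.uint8)
corrupted = {(name, s): fn(frame, s)
             for name, fn in [("gaussian_noise", gaussian_noise), ("fog", fog)]
             for s in range(1, 6)}
```

In the benchmark setting described above, each corruption type would be applied at severities 1 through 5 to every AOT frame while the original annotations are kept, so that any drop in detection mAP is attributable to the corruption alone.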