Ensuring UAV Safety: A Vision-only and Real-time Framework for Collision Avoidance Through Object Detection, Tracking, and Distance Estimation (2405.06749v2)

Published 10 May 2024 in cs.CV and cs.LG

Abstract: In the last twenty years, unmanned aerial vehicles (UAVs) have garnered growing interest due to their expanding applications in both military and civilian domains. Efficiently detecting non-cooperative aerial vehicles and accurately estimating collisions are pivotal for achieving fully autonomous aircraft and facilitating Advanced Air Mobility (AAM). This paper presents a deep-learning framework that utilizes optical sensors for the detection, tracking, and distance estimation of non-cooperative aerial vehicles. Within this sensing framework, depth information is essential for enabling autonomous aerial vehicles to perceive and navigate around obstacles. We propose a method for estimating the distance of a detected aerial object in real time using only monocular camera input. To train the deep learning components for the object detection, tracking, and depth estimation tasks, we utilize the Amazon Airborne Object Tracking (AOT) dataset. In contrast to previous approaches that integrate the depth estimation module into the object detector, our method formulates the problem as image-to-image translation and employs a separate lightweight encoder-decoder network for efficient and robust depth estimation. In short, the object detection module identifies and localizes obstacles, conveying this information to both the tracking module for monitoring obstacle movement and the depth estimation module for calculating distances. Our approach is evaluated on the AOT dataset, which is, to the best of our knowledge, the largest air-to-air airborne object dataset.
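
The abstract describes a three-stage architecture: a detector localizes airborne objects, a tracker maintains their identities across frames, and a separate lightweight encoder-decoder regresses a dense depth map from the same monocular frame, from which per-object distances are read off. Below is a minimal sketch of how such a pipeline could be wired together, assuming PyTorch; the toy encoder-decoder, the detector and tracker callables, and the median-depth readout are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class DepthEncoderDecoder(nn.Module):
    """Toy image-to-image network: RGB frame in, dense depth map out."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def process_frame(frame, detector, tracker, depth_net):
    """One step of the sensing loop on a monocular frame [1, 3, H, W]."""
    boxes = detector(frame)        # hypothetical detector: [N, 4] xyxy boxes
    tracks = tracker(boxes)        # hypothetical tracker: persistent track IDs
    with torch.no_grad():
        depth = depth_net(frame)   # [1, 1, H, W] dense depth map
    distances = []
    for x1, y1, x2, y2 in boxes.int().tolist():
        roi = depth[0, 0, y1:y2, x1:x2]  # depth values inside the box
        # Median is a robust per-object distance readout; guard empty boxes.
        distances.append(roi.median().item() if roi.numel() else float("nan"))
    return tracks, distances

# Smoke test with stub detector/tracker on a random 256x256 frame.
frame = torch.rand(1, 3, 256, 256)
stub_detector = lambda f: torch.tensor([[60.0, 60.0, 120.0, 120.0]])
stub_tracker = lambda b: list(range(len(b)))
tracks, dists = process_frame(frame, stub_detector, stub_tracker, DepthEncoderDecoder())

Decoupling the depth network from the detector in this way means each detected box simply indexes into the dense depth map, so detection and depth estimation can run in parallel on the same frame.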
