Dynamic Open Vocabulary Enhanced Safe-landing with Intelligence (DOVESEI) (2308.11471v6)

Published 22 Aug 2023 in cs.RO, cs.AI, and cs.CV

Abstract: This work targets what we consider to be the foundational step for urban airborne robots: a safe landing. Our attention is directed toward what we deem the most crucial aspect of the safe-landing perception stack: segmentation. We present a streamlined reactive UAV system that employs visual servoing by harnessing the capabilities of open vocabulary image segmentation. This approach can adapt to various scenarios with minimal adjustments, bypassing the necessity for extensive data accumulation for refining internal models, thanks to its open vocabulary methodology. Given the limitations imposed by local authorities, our primary focus centers on operations originating from altitudes of 100 meters. This choice is deliberate, as numerous preceding works have dealt with altitudes up to 30 meters, aligning with the capabilities of small stereo cameras. Consequently, we leave the remaining 20 meters to be navigated using conventional 3D path planning methods. Utilizing monocular cameras and image segmentation, our findings demonstrate the system's capability to successfully execute landing maneuvers at altitudes as low as 20 meters. However, this approach is vulnerable to intermittent and occasionally abrupt fluctuations in the segmentation between frames in a video stream. To address this challenge, we enhance the image segmentation output by introducing what we call a dynamic focus: a masking mechanism that self-adjusts according to the current landing stage. This dynamic focus guides the control system to avoid regions beyond the drone's safety radius projected onto the ground, thus mitigating the problems caused by these fluctuations. Through the implementation of this supplementary layer, our experiments show an almost tenfold improvement in the landing success rate compared to global segmentation. All the source code is open source and available online (github.com/MISTLab/DOVESEI).
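To illustrate the dynamic-focus idea described in the abstract, here is a minimal sketch (not the authors' implementation; the function name, the downward-facing pinhole-camera assumption, and all parameters are hypothetical) of masking a segmentation map so that only pixels within the drone's safety radius projected onto the ground are kept:

```python
import numpy as np

def dynamic_focus_mask(seg_map, altitude_m, safety_radius_m,
                       focal_px, landing_stage_gain=1.0):
    """Zero out segmentation scores outside a circle whose size is the
    drone's safety radius projected onto the image plane.

    Assumes a downward-facing pinhole camera: a ground-plane radius of
    `safety_radius_m` at `altitude_m` maps to roughly
    focal_px * safety_radius_m / altitude_m pixels around the image centre.
    All names and the stage gain are illustrative, not from the paper.
    """
    h, w = seg_map.shape
    radius_px = landing_stage_gain * focal_px * safety_radius_m / max(altitude_m, 1e-3)

    # Circular mask centred on the image, i.e. directly below the drone.
    yy, xx = np.mgrid[0:h, 0:w]
    inside = (yy - h / 2) ** 2 + (xx - w / 2) ** 2 <= radius_px ** 2

    # Keep segmentation scores only inside the focus region.
    return np.where(inside, seg_map, 0.0)
```

In a descent loop, such a mask would be recomputed each frame from the current altitude estimate, so the focused region always corresponds to the same safety footprint on the ground and flickering segments outside it no longer perturb the controller.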
