
PEACE: Prompt Engineering Automation for CLIPSeg Enhancement for Safe-Landing Zone Segmentation (2310.00085v5)

Published 29 Sep 2023 in cs.RO

Abstract: Safe landing is essential in robotics applications, from industrial settings to space exploration. As artificial intelligence advances, we have developed PEACE (Prompt Engineering Automation for CLIPSeg Enhancement), a system that automatically generates and refines prompts for identifying landing zones in changing environments. Traditional approaches using fixed prompts for open-vocabulary models struggle with environmental changes and can lead to dangerous outcomes when conditions are not represented in the predefined prompts. PEACE addresses this limitation by dynamically adapting to shifting data distributions. Our key innovation is the dual segmentation of safe and unsafe landing zones, allowing the system to refine the results by removing unsafe areas from potential landing sites. Using only monocular cameras and image segmentation, PEACE can safely guide descent operations from 100 meters to altitudes as low as 20 meters. Testing shows that PEACE significantly outperforms the standard CLIP and CLIPSeg prompting methods, improving the successful identification of safe landing zones from 57% to 92%. We have also demonstrated enhanced performance when replacing CLIPSeg with FastSAM. The complete source code is available as open-source software.
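
The dual safe/unsafe segmentation described in the abstract can be illustrated with a short sketch. This is a minimal, hypothetical example built on the public CLIPSeg checkpoint from Hugging Face (CIDAS/clipseg-rd64-refined), not the authors' released implementation; the prompt lists, threshold value, and helper name are assumptions for illustration only.

```python
# Minimal sketch of the dual-prompt idea: segment candidate safe zones and
# unsafe zones separately, then remove unsafe pixels from the safe mask.
# Assumes the public CLIPSeg checkpoint; prompts and threshold are illustrative.
import torch
from PIL import Image
from transformers import CLIPSegProcessor, CLIPSegForImageSegmentation

processor = CLIPSegProcessor.from_pretrained("CIDAS/clipseg-rd64-refined")
model = CLIPSegForImageSegmentation.from_pretrained("CIDAS/clipseg-rd64-refined")

def zone_mask(image: Image.Image, prompts: list[str], threshold: float = 0.5) -> torch.Tensor:
    """Union of per-prompt CLIPSeg masks for one downward-facing frame."""
    inputs = processor(text=prompts, images=[image] * len(prompts),
                       padding=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits          # (num_prompts, H, W)
    probs = torch.sigmoid(logits)
    return (probs > threshold).any(dim=0)        # pixel belongs to the zone for any prompt

frame = Image.open("downward_camera_frame.jpg")  # hypothetical monocular frame

safe_prompts = ["grass field", "flat open ground"]            # illustrative only
unsafe_prompts = ["person", "car", "water", "building roof"]  # illustrative only

safe = zone_mask(frame, safe_prompts)
unsafe = zone_mask(frame, unsafe_prompts)

# Refinement step from the abstract: subtract unsafe regions from safe candidates.
landing_candidates = safe & ~unsafe
```

In PEACE itself the prompt lists are generated and refined automatically rather than hard-coded, and the same subtraction pattern applies when CLIPSeg is replaced with FastSAM.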
