ALT-Pilot: Autonomous navigation with Language augmented Topometric maps (2310.02324v1)
Abstract: We present an autonomous navigation system that operates without assuming HD LiDAR maps of the environment. Our system, ALT-Pilot, relies only on publicly available road network information and a sparse (and noisy) set of crowdsourced language landmarks. With the help of onboard sensors and a language-augmented topometric map, ALT-Pilot autonomously pilots the vehicle to any destination on the road network. We achieve this by leveraging vision-LLMs pre-trained on web-scale data to identify potential landmarks in a scene, incorporating vision-language features into the recursive Bayesian state estimation stack to generate global (route) plans, and running a reactive trajectory planner and controller in the vehicle frame. We implement and evaluate ALT-Pilot in simulation and on a real, full-scale autonomous vehicle, and report improvements over state-of-the-art topometric navigation systems by a factor of 3x in localization accuracy and 5x in goal reachability.
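To make the idea of folding vision-language features into recursive Bayesian state estimation more concrete, here is a minimal sketch of a particle-filter measurement update. It is not the authors' implementation: the function names, the exponential likelihood, and the `sensing_range` and `sharpness` parameters are all assumptions chosen for illustration. Embeddings are stood in for by plain lists of floats.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def reweight(particles, weights, landmarks, observed,
             sensing_range=15.0, sharpness=5.0):
    """One measurement update of a particle filter (hypothetical sketch).

    particles: list of (x, y) pose hypotheses on the map
    weights:   prior particle weights, same length as particles
    landmarks: list of ((x, y), embedding) language landmarks
    observed:  embedding of what the camera currently sees
    """
    updated = []
    for (px, py), w in zip(particles, weights):
        likelihood = 1.0  # neutral when no landmark is within range
        for (lx, ly), emb in landmarks:
            if math.hypot(lx - px, ly - py) <= sensing_range:
                # Assumed likelihood model: higher similarity between the
                # observation and a nearby landmark raises the weight.
                likelihood *= math.exp(sharpness * cosine(observed, emb))
        updated.append(w * likelihood)
    total = sum(updated)
    return [w / total for w in updated]  # normalize to a distribution


# Two pose hypotheses; only the first is near a landmark that matches
# the current observation, so it should absorb most of the weight.
particles = [(0.0, 0.0), (100.0, 0.0)]
weights = [0.5, 0.5]
landmarks = [((2.0, 0.0), [1.0, 0.0])]
posterior = reweight(particles, weights, landmarks, observed=[1.0, 0.0])
```

In this sketch a particle far from every landmark keeps its prior weight, so sparse or missing landmarks leave the estimate unchanged rather than penalizing it.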