Integration of Large Language Models within Cognitive Architectures for Autonomous Robots (2309.14945v2)
Abstract: Symbolic reasoning systems have been used in cognitive architectures to provide inference and planning capabilities. However, defining domains and problems for them has proven difficult and error-prone. Meanwhile, Large Language Models (LLMs) have emerged as tools for processing natural language across different tasks. In this paper, we propose using LLMs to address these difficulties by integrating them into MERLIN2, a ROS 2-integrated cognitive architecture for autonomous robots. Specifically, we present the design, development and deployment of an approach that leverages the reasoning capabilities of LLMs inside the deliberative processes of MERLIN2. As a result, the deliberative system changes from a PDDL-based planning system to a natural language planning system. The proposal is evaluated quantitatively and qualitatively, measuring the impact of incorporating LLMs into the cognitive architecture. Results show that the classical approach achieves better performance, but the proposed solution provides enhanced interaction through natural language.
- Francisco J. Rodríguez-Lera
- Ángel Manuel Guerrero-Higueras
- Vicente Matellán-Olivera
- Miguel Á. González-Santamarta