Purpose for Open-Ended Learning Robots: A Computational Taxonomy, Definition, and Operationalisation (2403.02514v1)
Abstract: Autonomous open-ended learning (OEL) robots are able to cumulatively acquire new skills and knowledge through direct interaction with the environment, for example by relying on the guidance of intrinsic motivations and self-generated goals. OEL robots are highly relevant for applications, as they can use the autonomously acquired knowledge to accomplish tasks relevant for their human users. OEL robots, however, encounter an important limitation: their autonomous exploration may lead to the acquisition of knowledge that is of little relevance to the users' tasks. This work analyses a possible solution to this problem that pivots on the novel concept of 'purpose'. Purposes indicate what the designers and/or users want from the robot. The robot should use internal representations of purposes, called here 'desires', to focus its open-ended exploration towards the acquisition of knowledge relevant to accomplishing them. This work contributes to the development of a computational framework on purpose in two ways. First, it formalises a framework on purpose based on a three-level motivational hierarchy involving: (a) the purposes; (b) the desires, which are domain independent; (c) specific domain-dependent state-goals. Second, the work highlights key challenges raised by the framework, such as the 'purpose-desire alignment problem', the 'purpose-goal grounding problem', and the 'arbitration between desires'. Overall, the approach enables OEL robots to learn autonomously while also focusing on acquiring goals and skills that meet the purposes of the designers and users.
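The three-level motivational hierarchy named in the abstract can be pictured with a minimal Python sketch. Everything below is an illustrative assumption rather than the paper's formalisation: the class names `Purpose`, `Desire`, and `Goal`, the `weight` and `relevance` fields, and the product rule used in `select_goal` to arbitrate between desires are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Goal:
    """Level (c): a domain-dependent state-goal."""
    name: str
    relevance: float  # hypothetical estimate in [0, 1] of how well the goal serves its desire


@dataclass
class Desire:
    """Level (b): a domain-independent internal representation of a purpose."""
    name: str
    weight: float  # hypothetical priority used to arbitrate between desires
    goals: List[Goal] = field(default_factory=list)


@dataclass
class Purpose:
    """Level (a): what the designers and/or users want from the robot."""
    description: str
    desires: List[Desire] = field(default_factory=list)


def select_goal(purposes: List[Purpose]) -> Optional[Goal]:
    """Return the goal with the highest weight * relevance score.

    The product rule is an assumed stand-in for arbitration between
    desires; it returns None when no goal is available.
    """
    best: Optional[Goal] = None
    best_score = float("-inf")
    for purpose in purposes:
        for desire in purpose.desires:
            for goal in desire.goals:
                score = desire.weight * goal.relevance
                if score > best_score:
                    best, best_score = goal, score
    return best
```

In this sketch a robot would call `select_goal` on its current purposes to decide which state-goal to practise next. A real system would have to learn the `relevance` values, which is where the grounding problem mentioned in the abstract would enter.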
Authors:
- Gianluca Baldassarre
- Richard J. Duro
- Emilio Cartoni
- Mehdi Khamassi
- Alejandro Romero
- Vieri Giuliano Santucci