
Efficient Model Learning for Human-Robot Collaborative Tasks (1405.6341v1)

Published 24 May 2014 in cs.RO, cs.AI, cs.LG, and cs.SY

Abstract: We present a framework for learning human user models from joint-action demonstrations that enables the robot to compute a robust policy for a collaborative task with a human. The learning takes place completely automatically, without any human intervention. First, we describe the clustering of demonstrated action sequences into different human types using an unsupervised learning algorithm. These demonstrated sequences are also used by the robot to learn a reward function that is representative for each type, through the employment of an inverse reinforcement learning algorithm. The learned model is then used as part of a Mixed Observability Markov Decision Process formulation, wherein the human type is a partially observable variable. With this framework, we can infer, either offline or online, the human type of a new user that was not included in the training set, and can compute a policy for the robot that will be aligned to the preference of this new user and will be robust to deviations of the human actions from prior demonstrations. Finally we validate the approach using data collected in human subject experiments, and conduct proof-of-concept demonstrations in which a person performs a collaborative task with a small industrial robot.

Authors (4)
  1. Stefanos Nikolaidis (65 papers)
  2. Keren Gu (3 papers)
  3. Ramya Ramakrishnan (6 papers)
  4. Julie Shah (38 papers)
Citations (211)

Summary

  • The paper introduces a two-phase approach that uses unsupervised clustering to identify human action strategies and tailor robot responses.
  • It employs a Mixed Observability Markov Decision Process to manage uncertainty and dynamically adjust to human preferences.
  • Empirical validation on collaborative tasks demonstrates improved classification accuracy and adaptability over traditional state-based methods.

Efficient Model Learning for Human-Robot Collaborative Tasks

The paper, "Efficient Model Learning for Human-Robot Collaborative Tasks," presents a framework for enabling robots to predict and adapt to human behavior during collaborative tasks. The work is motivated by the growing presence of robots in human-centric environments, which demands that robots integrate into human team dynamics without explicit guidance or intervention.

Framework Overview

The authors propose a two-phase approach. First, the framework learns human user models from joint-action sequences demonstrated by human teams. Clustering these demonstrated sequences groups them into identifiable human types, informing the robot of the varied human preferences within the task domain.
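The clustering phase can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: the featurization (normalized action-transition counts) and the k-means procedure are assumptions standing in for the unsupervised clustering of demonstrated sequences into human types.

```python
import numpy as np

def sequence_features(seq, n_actions):
    """Encode an action sequence as normalized action-transition counts.
    (A hypothetical featurization for illustration; the paper clusters
    demonstrated joint-action sequences.)"""
    counts = np.zeros((n_actions, n_actions))
    for a, b in zip(seq, seq[1:]):
        counts[a, b] += 1
    total = counts.sum()
    return counts.ravel() / total if total else counts.ravel()

def cluster_types(sequences, n_actions, k, iters=50):
    """Toy k-means over sequence features, standing in for the paper's
    unsupervised clustering of demonstrations into human types."""
    X = np.array([sequence_features(s, n_actions) for s in sequences])
    # Farthest-point initialization keeps the sketch deterministic.
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.square(X - c).sum(-1) for c in centers], axis=0)
        centers.append(X[int(np.argmax(d))])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(np.square(X[:, None] - centers).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels
```

Each resulting cluster is then treated as one "human type" whose demonstrations feed the later reward-learning step.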

Central to the model is a Mixed Observability Markov Decision Process (MOMDP), which captures the partially observable nature of human preferences: the human's type, inferred via clustering, enters the MOMDP as a latent state variable. For each type, the robot learns a representative reward function using inverse reinforcement learning (IRL), enabling a personalized and adaptive robotic response.
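Because the type is latent, the robot maintains a belief over types and updates it as it observes the human act. The sketch below shows one Bayes step under an assumed per-type action model `action_model[t][a]` (a hypothetical stand-in for the likelihoods the MOMDP would use):

```python
import numpy as np

def update_type_belief(belief, human_action, action_model):
    """One Bayes step on the belief over latent human types after
    observing a human action. `action_model[t][a]` is a hypothetical
    per-type probability of action `a`; in the MOMDP, the type is the
    partially observable component of the state."""
    posterior = np.asarray(belief) * np.asarray(action_model)[:, human_action]
    return posterior / posterior.sum()
```

Starting from a uniform prior, repeated observations concentrate the belief on the type whose model best explains the human's actions, which is how a new user outside the training set can be classified online.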

Technical Contributions

Among the several innovations presented, the clustering of human types via unsupervised learning stands out, facilitating the automatic identification of human action strategies. Notably, this approach eschews the need for labelled data, traditionally a bottleneck in similar tasks.

The paper also develops the MOMDP formulation, exploiting the structure's ability to handle the limited observability and uncertainty inherent in human-robot collaboration. By adopting inverse reinforcement learning, the model bypasses manually specified rewards, learning them directly from unlabelled demonstrations. This enhances scalability and generalizability, making the framework applicable across diverse collaborative scenarios.
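The reward-learning step can be illustrated with a gradient sketch of IRL as feature-expectation matching: adjust reward weights so that demonstrated trajectories outscore alternatives. This is a maximum-entropy-style stand-in for the paper's IRL procedure, and the per-trajectory feature vectors are assumed to be precomputed.

```python
import numpy as np

def irl_reward_weights(demo_feats, alt_feats, lr=0.1, steps=500):
    """Fit reward weights w so demonstrated trajectories outscore
    alternative trajectories (an illustrative expectation-matching
    sketch, not the paper's exact IRL algorithm)."""
    demo_feats = np.asarray(demo_feats, dtype=float)
    alt_feats = np.asarray(alt_feats, dtype=float)
    w = np.zeros(demo_feats.shape[1])
    mu_demo = demo_feats.mean(axis=0)          # demonstrated expectations
    for _ in range(steps):
        scores = alt_feats @ w                 # reward of each alternative
        p = np.exp(scores - scores.max())
        p /= p.sum()                           # softmax trajectory dist.
        w += lr * (mu_demo - p @ alt_feats)    # expectation-matching step
    return w
```

Running this once per cluster yields one reward function per human type, which is the quantity the MOMDP policy later conditions on.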

Because the policy is computed within the MOMDP framework, with the reward function selected according to the inferred human type, robot actions remain aligned with the collaborator's preferences even when the human's actions deviate from those previously demonstrated.
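At execution time, this amounts to weighting per-type value estimates by the current belief rather than committing to a single guessed type. A minimal sketch, where `q_values[t][a]` is a hypothetical value of robot action `a` when collaborating with a human of type `t`:

```python
import numpy as np

def robot_action(belief, q_values):
    """Pick the robot action maximizing expected value under the current
    belief over human types — a sketch of how a MOMDP policy hedges
    across types instead of assuming the most likely one."""
    expected = np.asarray(belief) @ np.asarray(q_values)
    return int(np.argmax(expected))
```

As the belief sharpens toward one type, the chosen actions converge to that type's preferred policy.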

Empirical Validation

The framework's validation was carried out on a human-robot place-and-drill task, complemented by a large-scale hand-finishing task. Through these tests, the proposed method demonstrated a high classification accuracy of human types, aligning closely with expert human labelling. Crucially, the resulting MOMDP policies exhibited robust performance despite deviations from demonstrated human behaviors. Comparative results highlighted the advantages of the proposed framework over existing state-based collaborative algorithms, showing enhanced reliability and adaptability in managing deviations.

Implications and Future Directions

This research carries several implications for the future of AI and robotics in collaborative environments. Primarily, it charts a path toward more autonomous and capable robotic systems that dynamically adapt to human partners' distinct working styles, potentially transforming human-robot interaction across sectors such as manufacturing and service robotics.

Looking ahead, the paper invites exploration into the automatic estimation of observation functions and further refinement of learned user models. The alignment of robot behavior to human preferences without direct supervision or intervention could be extended to complex, scenario-specific constraints, broadening the applicability of the proposed framework.

In conclusion, the framework delineated in the paper provides a robust mechanism for aligning robotic actions with human preferences in collaborative environments, marking a substantive step towards more seamless human-robot cooperation. Future research efforts might focus on enhancing the scalability and applicability of these models in varied interactive scenarios, potentially revolutionizing interactive collaborative frameworks in multi-agent systems.