Large Language Models as Zero-Shot Human Models for Human-Robot Interaction

Published 6 Mar 2023 in cs.RO, cs.CL, cs.HC, and cs.LG | (2303.03548v2)

Abstract: Human models play a crucial role in human-robot interaction (HRI), enabling robots to consider the impact of their actions on people and plan their behavior accordingly. However, crafting good human models is challenging; capturing context-dependent human behavior requires significant prior knowledge and/or large amounts of interaction data, both of which are difficult to obtain. In this work, we explore the potential of LLMs -- which have consumed vast amounts of human-generated text data -- to act as zero-shot human models for HRI. Our experiments on three social datasets yield promising results; the LLMs are able to achieve performance comparable to purpose-built models. That said, we also discuss current limitations, such as sensitivity to prompts and spatial/numerical reasoning mishaps. Based on our findings, we demonstrate how LLM-based human models can be integrated into a social robot's planning process and applied in HRI scenarios. Specifically, we present one case study on a simulated trust-based table-clearing task and replicate past results that relied on custom models. Next, we conduct a new robot utensil-passing experiment (n = 65) where preliminary results show that planning with an LLM-based human model can achieve gains over a basic myopic plan. In summary, our results show that LLMs offer a promising (but incomplete) approach to human modeling for HRI.

Citations (37)

Summary

  • The paper demonstrates that LLMs can serve as zero-shot predictors for human behavior in HRI using extensive social datasets.
  • It employs benchmark datasets like MANNERS-DB, Trust-Transfer, and SocialIQA to evaluate LLM performance against specialized models.
  • The study highlights limitations in spatial reasoning and prompt sensitivity, suggesting the need for integration with additional learning models.

LLMs as Zero-Shot Human Models for Human-Robot Interaction

This essay provides a detailed exploration of the use of LLMs as zero-shot human models in Human-Robot Interaction (HRI), focusing on their capability to function as predictive models without additional training. The research demonstrates both the potential and limitations of LLMs in this domain through empirical studies and practical experiments.

Introduction

The research investigates the application of LLMs, traditionally used in NLP, as human models in HRI. These models, pre-trained on extensive datasets, have not been explicitly configured for human modeling in robotic contexts, yet they exhibit significant promise as zero-shot predictors of human behavior in social interactions.

Figure 1: This study explores the application of LLMs as zero-shot human models in HRI, evaluating their effectiveness with benchmark datasets and demonstrating their use in trust-based scenarios.

Evaluation of LLMs on Social Datasets

Three prominent datasets were utilized to evaluate the predictive capabilities of LLMs: MANNERS-DB, Trust-Transfer, and SocialIQA. These datasets cover different aspects of human social behavior and interaction.

LLMs like FLAN-T5 and a variant of GPT-3.5 were compared to specialized models designed for these tasks under a zero-shot learning framework. Results indicate that LLMs can perform comparably to these specialized models on predictive tasks with no additional training data, albeit with certain limitations.
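To make the zero-shot setup concrete, here is a minimal sketch of querying FLAN-T5 via the Hugging Face transformers library for a MANNERS-DB-style appropriateness rating. The prompt wording, 1-5 rating scale, and example scene are illustrative assumptions, not the paper's exact templates.

```python
# Minimal sketch: zero-shot human-behavior prediction with FLAN-T5.
# Prompt template and rating scale are illustrative, not the paper's exact setup.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/flan-t5-large"  # any FLAN-T5 checkpoint works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

def rate_appropriateness(scene: str, robot_action: str) -> str:
    """Ask the LLM how appropriate a robot action is in a described scene."""
    prompt = (
        f"Scene: {scene}\n"
        f"The robot is about to {robot_action}.\n"
        "On a scale of 1 (very inappropriate) to 5 (very appropriate), "
        "how appropriate is this action? Answer with a single number."
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=5)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

print(rate_appropriateness(
    "a living room where two adults are talking and a child plays on the floor",
    "vacuum the carpet next to the child",
))
```

Because the model is queried with natural-language descriptions alone, no task-specific training data is needed; the same pattern extends to Trust-Transfer and SocialIQA-style questions by changing the prompt.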

Figure 2: Datasets and example prompts used in the prediction experiments, illustrating how the LLM-based models were queried across the three HRI datasets.

Limitations of LLMs

The study identifies specific limitations of LLMs:

  1. Spatial and Numerical Reasoning: LLMs show deficiencies in tasks requiring spatial awareness or numerical computation, which limits their use in HRI tasks that depend on such reasoning.
  2. Prompt Sensitivity: Performance varies with prompt design, so careful prompt formulation is necessary to leverage LLMs effectively.

These weaknesses underscore the need to integrate LLMs with additional learning models that can handle spatial and numerical reasoning more robustly.
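As a hypothetical illustration of how prompt sensitivity could be measured, the sketch below scores several semantically equivalent templates against the same human labels; a large spread across paraphrases indicates fragility. The templates and the query_llm callable are placeholders, not the paper's evaluation protocol.

```python
# Hypothetical sketch: quantifying prompt sensitivity by scoring
# paraphrased templates against the same held-out human labels.
from statistics import mean, stdev

TEMPLATES = [
    "How appropriate is it for the robot to {action}? Rate 1-5.",
    "Rate from 1 to 5 how acceptable it is for the robot to {action}.",
    "On a 1-5 scale, should the robot {action}?",
]

def template_accuracy(template, examples, query_llm):
    """Fraction of (action, label) pairs where the LLM's answer matches."""
    hits = sum(
        query_llm(template.format(action=action)).strip().startswith(str(label))
        for action, label in examples
    )
    return hits / len(examples)

def prompt_sensitivity(examples, query_llm):
    """Return mean accuracy and its spread across paraphrased prompts."""
    scores = [template_accuracy(t, examples, query_llm) for t in TEMPLATES]
    return mean(scores), stdev(scores)  # high stdev => prompt-sensitive
```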

Planning for HRI with LLMs

The study progresses from prediction to the use of LLMs in planning for HRI scenarios, presenting results from two case studies: a table-clearing experiment and a utensil-passing experiment.

  1. Table-Clearing Experiment: Simulating a trust-based task, an LLM-based human model was integrated with a planner to select robot actions in a shared environment, effectively replicating prior experimental results that relied on custom models (a minimal planning sketch follows this list).
  2. Utensil-Passing Experiment: Designed to test trust dynamics further, this experiment highlighted the LLM's ability to adjust its strategy based on human interventions and trust levels, showcasing its potential in active planning scenarios for HRI.
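As referenced in the first case study above, the sketch below shows one way an LLM-based human model could plug into a simple one-step planner for table clearing. The intervention prompt, success probabilities, and value function are assumptions for illustration; the paper's actual planner and trust model differ in detail.

```python
# Minimal sketch: an LLM as a human model inside a one-step planner,
# loosely in the spirit of the table-clearing case study.

def predict_intervention(history: str, obj: str, query_llm) -> float:
    """Query the LLM for whether the human would take over if the robot
    attempts to pick up `obj`; returns a crude probability estimate."""
    prompt = (
        f"Interaction so far: {history}\n"
        f"If the robot now tries to pick up the {obj}, will the human "
        "take over the task? Answer 'yes' or 'no'."
    )
    answer = query_llm(prompt).strip().lower()
    return 1.0 if answer.startswith("yes") else 0.0

def choose_object(objects, success_prob, history, query_llm):
    """Pick the object with the highest expected value: the robot only
    earns the success payoff when the human does not intervene."""
    def expected_value(obj):
        p_intervene = predict_intervention(history, obj, query_llm)
        return (1.0 - p_intervene) * success_prob[obj]
    return max(objects, key=expected_value)

# Illustrative call (success probabilities are made up):
# choose_object(
#     ["wine glass", "plastic cup"],
#     {"wine glass": 0.6, "plastic cup": 0.95},
#     "The robot has cleared two plates without dropping anything.",
#     query_llm,  # any text-in/text-out LLM call, e.g. the FLAN-T5 wrapper above
# )
```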

Figure 3: Utensils and experimental setup, showing the physical context in which the LLM-based planning was tested.

Figure 4: Illustration of success and intentional failure conditions in the utensil-passing task to mitigate over-trust issues.

Conclusion

The introduction of LLMs as zero-shot human models in HRI settings represents a significant stride toward more adaptive and human-centric robotic systems. Despite their observed limitations in spatial reasoning and sensitivity to prompt design, LLMs offer considerable strength in capturing the latent states and behaviors of humans, crucial for effective HRI.

While LLMs alone may not suffice as comprehensive human models due to their current limitations, their integration with complementary models that handle spatial and numerical reasoning offers fertile ground for future research, potentially leading to more sophisticated and contextually aware robotic systems in human environments.
