VITA: A Multi-modal LLM-based System for Longitudinal, Autonomous, and Adaptive Robotic Mental Well-being Coaching (2312.09740v1)

Published 15 Dec 2023 in cs.RO

Abstract: Recently, several works have explored if and how robotic coaches can promote and maintain mental well-being in different settings. However, findings from these studies revealed that these robotic coaches are not ready to be used and deployed in real-world settings due to several limitations that span from technological challenges to coaching success. To overcome these challenges, this paper presents VITA, a novel multi-modal LLM-based system that allows robotic coaches to autonomously adapt to the coachee's multi-modal behaviours (facial valence and speech duration) and deliver coaching exercises in order to promote mental well-being in adults. We identified five objectives that correspond to the challenges in the recent literature, and we show how the VITA system addresses these via experimental validations that include one in-lab pilot study (N=4) that enabled us to test different robotic coach configurations (pre-scripted, generic, and adaptive models) and inform its design for using it in the real world, and one real-world study (N=17) conducted in a workplace over 4 weeks. Our results show that: (i) coachees perceived the VITA adaptive and generic configurations more positively than the pre-scripted one, and they felt understood and heard by the adaptive robotic coach, (ii) the VITA adaptive robotic coach kept learning successfully by personalising to each coachee over time and did not detect any interaction ruptures during the coaching, (iii) coachees had significant mental well-being improvements via the VITA-based robotic coach practice. The code for the VITA system is openly available via: https://github.com/Cambridge-AFAR/VITA-system.

Authors (3)
  1. Micol Spitale (26 papers)
  2. Minja Axelsson (9 papers)
  3. Hatice Gunes (73 papers)
Citations (11)

Summary

The paper "VITA: A Multi-modal LLM-based System for Longitudinal, Autonomous, and Adaptive Robotic Mental Well-being Coaching" (Spitale et al., 2023) introduces VITA, a novel system employing multi-modal inputs and LLMs to deliver longitudinal, autonomous, and adaptive robotic mental well-being coaching. VITA aims to overcome the limitations of existing robotic coaches by enhancing interactivity, personalization, and conversational capabilities. The system enables robotic coaches to adapt autonomously to a coachee's multi-modal behaviors, specifically facial valence and speech duration, to deliver personalized coaching exercises designed to promote mental well-being in adults. The code for VITA is available on GitHub.

VITA leverages facial valence and speech duration as its primary modalities. Facial valence is quantified by measuring deviations in valence values derived from a coachee's facial expressions relative to their baseline. Speech duration is measured by normalizing the duration of the coachee's speech. These two metrics are then combined to generate a reward signal that is used within a reinforcement learning model. The reward function is defined as $R[s_t] = FV_t + SD_t$, where $FV_t$ represents facial valence and $SD_t$ represents speech duration at time $t$.
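As a rough illustration of how the two terms could be combined, the following is a minimal sketch and not the authors' implementation. The summary only states that facial valence is a deviation from baseline and that speech duration is normalized; the specific choices here (mean deviation over frames, division by a hypothetical maximum turn length `max_duration_s`, clipping to [0, 1]) and all function names are assumptions made for illustration.

```python
from statistics import mean


def facial_valence_term(valences, baseline):
    """FV_t: mean per-frame valence minus the coachee's baseline (assumed form)."""
    return mean(valences) - baseline


def speech_duration_term(duration_s, max_duration_s):
    """SD_t: speech duration scaled into [0, 1] by an assumed maximum turn length."""
    if max_duration_s <= 0:
        raise ValueError("max_duration_s must be positive")
    return min(max(duration_s / max_duration_s, 0.0), 1.0)


def reward(valences, baseline, duration_s, max_duration_s=60.0):
    """R[s_t] = FV_t + SD_t, the sum described in the summary."""
    fv = facial_valence_term(valences, baseline)
    sd = speech_duration_term(duration_s, max_duration_s)
    return fv + sd
```

With these assumptions, a turn in which the coachee's valence sits above baseline and the coachee speaks for longer yields a larger reward, which matches the summary's reading of the reward as a proxy for positive engagement.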

The personalization of the coaching experience in VITA is achieved through a reinforcement learning (RL) framework. A DQN model learns a conversational policy $\pi: s_t \rightarrow a_t$ from the HHI4PP dataset, mapping observation states ($s_t$) to actions ($a_t$). The action space includes options such as summarizing, asking a follow-up question, or requesting a new episode. VITA employs an adaptive RL model that fine-tunes a pre-trained generic RL model using real-time data from each coachee. This adaptive approach allows the system to personalize its responses and dialogue flow over time, aligning with the coachee's individual behavior to maximize the reward signal, indicative of positive engagement and emotional state.
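The generic-then-adaptive idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the VITA code: it swaps the deep Q-network for a small linear Q-function so the example stays self-contained, and the class `LinearQ`, the function `personalize`, the state features, and the hyperparameters are all hypothetical. Only the three action names come from the summary.

```python
import copy
import random

# Action names taken from the summary; their order and indices are an assumption.
ACTIONS = ["summarize", "follow_up_question", "new_episode"]


class LinearQ:
    """Linear Q-function: one weight vector per action (stand-in for a DQN)."""

    def __init__(self, n_features, n_actions=len(ACTIONS)):
        self.w = [[0.0] * n_features for _ in range(n_actions)]

    def q_values(self, state):
        return [sum(wi * si for wi, si in zip(row, state)) for row in self.w]

    def act(self, state, epsilon=0.1, rng=random):
        """Epsilon-greedy action selection over the current Q-values."""
        if rng.random() < epsilon:
            return rng.randrange(len(self.w))
        q = self.q_values(state)
        return max(range(len(q)), key=q.__getitem__)

    def update(self, state, action, reward, next_state, alpha=0.1, gamma=0.9):
        """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
        target = reward + gamma * max(self.q_values(next_state))
        td_error = target - self.q_values(state)[action]
        self.w[action] = [
            wi + alpha * td_error * si for wi, si in zip(self.w[action], state)
        ]
        return td_error


def personalize(generic, transitions):
    """Copy the generic model, then fine-tune the copy on one coachee's data."""
    adaptive = copy.deepcopy(generic)
    for state, action, reward, next_state in transitions:
        adaptive.update(state, action, reward, next_state)
    return adaptive
```

Copying the generic model before updating keeps one shared starting point while each coachee gets a separate policy, which is one plausible way to realize the per-coachee adaptation described above.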

The effectiveness of VITA was evaluated in a real-world study involving 17 employees (7 females, 10 males) at a tech company. Participants, who were pre-screened for anxiety and depression, interacted with the VITA-based robotic coach weekly over a period of 4 weeks. The robotic coach, equipped with the adaptive reinforcement learning (ADAPT-RL) model, personalized the dialogue flow based on the coachee's facial valence and speech duration. Data were collected through VITA system logs, pre- and post-study questionnaires (including the Ryff Psychological Well-being Scale, RPWS), and post-study semi-structured interviews. The study's key findings indicated significant improvements in coachees' self-reported mental well-being, evidenced by increases in the RPWS sub-scales for personal growth, positive relations with others, purpose in life, and self-acceptance. The adaptive robotic coach demonstrated successful learning, indicated by an increase in the reward function over time, suggesting effective personalization. Furthermore, no interaction ruptures were automatically detected during the coaching sessions, and coachees generally expressed a positive impression of the robotic coach and found the coaching practice beneficial. However, some participants noted a perceived lack of empathy in the robot's responses.
