- The paper introduces a shielding technique that integrates human preferences into RL to align robot actions with user expectations.
- The study shows that combining preferences with explanations significantly improves legibility, predictability, and expectability in robot navigation.
- The research demonstrates that transparent RL enhances user trust and perceived safety, fostering effective human-robot collaboration.
Increasing Transparency of Reinforcement Learning using Shielding for Human Preferences and Explanations
Transparency in Reinforcement Learning (RL) for autonomous systems, especially in human-centered applications, is a critical challenge in Human-Robot Interaction (HRI). The paper "Increasing Transparency of Reinforcement Learning using Shielding for Human Preferences and Explanations" investigates the integration of human preferences and explanatory mechanisms to enhance the transparency of robotic behaviors. The research aims to ensure users can comprehend and predict robot actions, thereby fostering more effective collaboration.
Context and Motivation
In non-industrial settings, robots need to acquire new skills adaptively, since pre-programming them for every potential scenario is impractical. The adaptability provided by RL is advantageous; however, it can lead to opaque and unpredictable behaviors during the learning phase, creating barriers to effective HRI. This lack of transparency can result in users perceiving robots as "black boxes," undermining trust and usability. The paper posits that incorporating human preferences into the RL process can mitigate this issue, promoting transparent robot behaviors.
Methodology
The researchers propose a novel RL mechanism that utilizes a shielding technique to integrate human preferences. This mechanism diverges from conventional safe RL approaches by not only avoiding unsafe actions but also aligning the robot's actions with user preferences. The shielding mechanism supervises the decision-making process within the RL framework, evaluating and potentially altering the robot's chosen actions based on safety and user preferences. The method also provides explanations to users whenever the shield modifies an action, thereby addressing Legibility, Predictability, and Expectability, the components central to transparency.
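A minimal sketch of how such a preference- and safety-aware shield might wrap an agent's action selection is shown below, assuming a discrete action set and simple callable predicates. The class and function names (`PreferenceShield`, `is_safe`, `matches_preference`) are illustrative and not taken from the paper.

```python
import random


class PreferenceShield:
    """Illustrative shield that supervises an RL agent's action choice.

    It keeps the proposed action if it is safe and consistent with the
    user's preferences; otherwise it substitutes an admissible action
    and returns a short explanation of why it intervened.
    """

    def __init__(self, actions, is_safe, matches_preference):
        self.actions = actions                        # discrete action set
        self.is_safe = is_safe                        # (state, action) -> bool
        self.matches_preference = matches_preference  # (state, action) -> bool

    def filter(self, state, proposed):
        if self.is_safe(state, proposed) and self.matches_preference(state, proposed):
            return proposed, None  # no intervention, no explanation needed

        # Prefer actions that are both safe and preference-consistent;
        # fall back to merely safe actions if none exist.
        candidates = [a for a in self.actions
                      if self.is_safe(state, a) and self.matches_preference(state, a)]
        if not candidates:
            candidates = [a for a in self.actions if self.is_safe(state, a)]

        chosen = random.choice(candidates)  # assumes at least one safe action exists
        reason = ("it was unsafe" if not self.is_safe(state, proposed)
                  else "it conflicted with your stated preference")
        explanation = f"I chose '{chosen}' instead of '{proposed}' because {reason}."
        return chosen, explanation
```

In a typical shielded-RL loop, the environment is stepped with the shield's output rather than the agent's raw proposal; whether and how the agent learns from the substitution depends on the specific shielding scheme.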
Experimental Framework
The efficacy of this approach was evaluated through a user study involving 26 participants in a Gridworld navigation scenario where robots had to navigate obstacles to reach a goal state. The study compared four learning mechanisms (a configuration sketch follows the list):
- Learning 1: Incorporates human preferences without providing explanations.
- Learning 2: Neither incorporates preferences nor provides explanations.
- Learning 3: Incorporates preferences and provides explanations.
- Learning 4: Provides explanations but does not incorporate preferences.
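These four mechanisms form a 2×2 design over two independent switches: whether the shield enforces user preferences, and whether the robot explains its interventions. A minimal sketch of how the conditions could be expressed as configuration flags (the flag names are illustrative, not from the paper):

```python
# Illustrative mapping of the four learning mechanisms onto two switches:
# whether the shield enforces user preferences, and whether the robot
# verbalizes an explanation when the shield overrides an action.
CONDITIONS = {
    "Learning 1": {"use_preferences": True,  "give_explanations": False},
    "Learning 2": {"use_preferences": False, "give_explanations": False},
    "Learning 3": {"use_preferences": True,  "give_explanations": True},
    "Learning 4": {"use_preferences": False, "give_explanations": True},
}
```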
Quantitative and Qualitative Findings
The study's results consistently indicate that incorporating human preferences and providing explanations significantly enhances transparency. Specifically:
- Legibility: The combination of preferences and explanations (Learning 3) significantly improved the Legibility of the robot's actions compared to other mechanisms.
- Predictability and Expectability: Learning 3 also outperformed other mechanisms, suggesting that users could better anticipate and understand the robot's behavior.
Transparency was quantified using a composite measure derived from Legibility, Predictability, and Expectability, with Learning 3 demonstrating the highest scores. Additionally, the incorporation of human preferences and explanations increased participants' perceived safety, comfort, and reliability in their interactions with the robot.
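As a rough illustration, such a composite can be formed by averaging the three subscale ratings; the sketch below assumes equal weighting, which may differ from the exact aggregation used in the paper.

```python
def transparency_score(legibility, predictability, expectability):
    """Illustrative composite transparency measure: equal-weight mean of
    the three subscale ratings (the paper's exact aggregation may differ)."""
    return (legibility + predictability + expectability) / 3.0
```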
Theoretical and Practical Implications
From a theoretical perspective, the findings underscore the importance of incorporating user preferences in RL to enhance transparency, beyond merely avoiding unsafe actions. This approach aligns RL more closely with human expectations, making robotic behaviors more predictable and understandable.
Practically, these results have significant implications for the design of socially interactive robots. By enhancing transparency, robots are perceived as more reliable and become more acceptable in human-centric environments, potentially accelerating their integration into daily life. Moreover, this mechanism can be applied in various domains, from assistive robotics to autonomous vehicles, where seamless human-machine collaboration is essential.
Future Directions
Future research should extend these findings to more complex scenarios and real-world applications, evaluating the scalability and robustness of the proposed shielding mechanism. Investigating the impact of more sophisticated explanatory frameworks and user modeling techniques on transparency can further refine this approach. Furthermore, balancing the computational cost of the shielding mechanism with the benefits of faster convergence and improved transparency remains an open challenge.
Conclusion
The paper makes a notable contribution to the field of HRI by proposing a method that enhances the transparency of RL through integrating human preferences and explanatory mechanisms. This approach not only ensures safer interactions but also aligns robot behaviors more closely with human expectations, fostering trust and cooperation. As autonomous systems become increasingly prevalent, such methodologies will be crucial for their effective and harmonious integration into human environments.