SELFI: Autonomous Self-Improvement with Reinforcement Learning for Social Navigation

Published 1 Mar 2024 in cs.RO, cs.CV, and cs.LG (arXiv:2403.00991v2)

Abstract: Autonomous self-improving robots that interact and improve with experience are key to the real-world deployment of robotic systems. In this paper, we propose an online learning method, SELFI, that leverages online robot experience to efficiently fine-tune pre-trained control policies. SELFI applies online model-free reinforcement learning on top of offline model-based learning to bring out the best parts of both learning paradigms. Specifically, SELFI stabilizes the online learning process by incorporating the same model-based learning objective from offline pre-training into the Q-values learned with online model-free reinforcement learning. We evaluate SELFI in multiple real-world environments and report improvements in terms of collision avoidance, as well as more socially compliant behavior, measured by a human user study. SELFI enables us to quickly learn useful robotic behaviors with fewer human interventions, such as pre-emptive behavior around pedestrians, collision avoidance for small and transparent objects, and avoiding travel on uneven floor surfaces. We provide supplementary videos to demonstrate the performance of our fine-tuned policy on our project page.

Summary

  • The paper presents SELFI, which integrates reinforcement learning techniques to fine-tune pre-trained social navigation policies for robots.
  • It combines model-free online learning with offline model-based learning to stabilize and accelerate policy improvement in complex, real-world scenarios.
  • Real-world evaluations and a human user study show improved collision avoidance and more socially compliant behavior around pedestrians, with fewer human interventions required during learning.

Autonomous Self-Improvement in Social Navigation Robots Through Reinforcement Learning

Overview

Deploying robotic systems in the real world increasingly depends on robots that can learn and adapt from their own experience. The paper "SELFI: Autonomous Self-Improvement with Reinforcement Learning for Social Navigation" introduces SELFI, an online learning method for fine-tuning pre-trained control policies of robots performing social navigation tasks. The method combines online model-free reinforcement learning (RL) with offline model-based learning to draw on the strengths of both paradigms, enabling rapid and efficient policy improvement.

Methodology

SELFI stabilizes the online learning process by incorporating a model-based learning objective, utilized during the offline pre-training phase, into the Q-values learned through online model-free reinforcement learning. This integration facilitates the fine-tuning of control policies in real-world environments by enabling robots to learn from online experiences without significant human intervention. The method is particularly tailored for social navigation, where robots must navigate indoor spaces while avoiding obstacles and maintaining socially compliant behavior around pedestrians.
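
To make this integration concrete, the sketch below shows one plausible form of the combined value, assuming (as the abstract suggests) that a candidate action is scored by the frozen model-based objective from offline pre-training plus the model-free Q-value learned online. The function and argument names are hypothetical illustrations, not the authors' API.

```python
import torch

def combined_value(state: torch.Tensor, action: torch.Tensor,
                   model_based_objective, q_network) -> torch.Tensor:
    """Hypothetical combined critic used to score a candidate action.

    model_based_objective: frozen objective from offline, model-based
        pre-training (e.g. a model-based estimate of collision-free
        progress toward the goal).
    q_network: model-free Q-function trained online from real rewards.
    """
    j_mb = model_based_objective(state, action)  # offline, model-based term
    q_mf = q_network(state, action)              # online, model-free term
    return j_mb + q_mf                           # the policy maximizes this sum
```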

The SELFI framework consists of several key components:

  1. Model-Free Reinforcement Learning: At its core, SELFI employs model-free RL to learn from real-world interactions, with the goal of maximizing the expected sum of future rewards. It leverages actor-critic methods to optimize both the policy and the action-value function; a minimal training-step sketch follows this list.
  2. Model-Based Learning Objective: A novel aspect of SELFI is its use of a hybrid objective that combines a learned model-free critic with a pre-existing model-based trajectory value estimate. This combination allows the robot to begin its online learning phase with a reasonable approximation of desired behaviors, facilitating a smoother learning process and enabling more rapid improvements.
  3. Social Navigation in Real-World Environments: The application of SELFI to social navigation is thoroughly evaluated. The method not only improves basic navigational capabilities, such as collision avoidance, but also enhances the robot's performance in socially relevant aspects, such as preemptively avoiding pedestrians and navigating smoothly around small or transparent obstacles.
  4. Fine-Tuning and Behavioral Improvement: Through extensive real-world testing, SELFI demonstrates its capability to fine-tune pre-trained policies effectively. Robots are able to adapt to specific environmental challenges, learning complex behaviors like avoiding uneven floor surfaces, which would be difficult to encode directly through offline model-based methods or learn efficiently from scratch using model-free RL alone.
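
Expanding on items 1 and 2 above, the sketch below illustrates how such an actor-critic update could look when the actor maximizes the combined value J_MB(s, a) + Q(s, a). This is a hedged, TD3/DDPG-style illustration under our own assumptions; the network architectures, the frozen `j_mb` objective, the replay batch format, and all names are hypothetical rather than the authors' implementation.

```python
import torch
import torch.nn.functional as F

def selfi_style_update(actor, critic, critic_target, j_mb, batch,
                       actor_opt, critic_opt, gamma=0.99, tau=0.005):
    """One hypothetical online fine-tuning step in the spirit of SELFI."""
    s, a, r, s_next, done = batch  # float tensors sampled from an online replay buffer

    # Critic: standard TD target for the learned (model-free) Q-value.
    with torch.no_grad():
        a_next = actor(s_next)
        target_q = r + gamma * (1.0 - done) * critic_target(s_next, a_next)
    critic_loss = F.mse_loss(critic(s, a), target_q)
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Actor: maximize the combined value J_MB(s, a) + Q(s, a).
    a_pi = actor(s)
    actor_loss = -(j_mb(s, a_pi) + critic(s, a_pi)).mean()
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()

    # Slowly track the online critic with the target critic (Polyak averaging).
    with torch.no_grad():
        for p, p_t in zip(critic.parameters(), critic_target.parameters()):
            p_t.mul_(1.0 - tau).add_(tau * p)
```

Intuitively, the frozen model-based term keeps the fine-tuned policy close to sensible pre-trained behavior early in online learning, while the learned Q-term gradually accounts for effects the offline model does not capture, such as pedestrian reactions, small or transparent obstacles, and uneven floor surfaces.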

Implications and Future Directions

The SELFI framework represents a significant step forward in the field of robotic learning, especially in contexts requiring nuanced interaction with humans and complex navigation tasks. By leveraging the strengths of both online and offline learning methods, SELFI enables robots to adapt more effectively to their operational environments, reducing the need for human intervention during the learning process.

This research is relevant to a wide range of applications, from service robots in public spaces to assistive devices in healthcare settings. Future work could explore the integration of more complex social behaviors into the SELFI framework, further enhancing robots' ability to operate smoothly in human environments. Extending the methodology to more diverse learning objectives and environmental contexts will also be important for advancing autonomous robotic systems.

In conclusion, the development of SELFI marks a promising advancement in the quest for creating autonomous robots capable of self-improvement through reinforcement learning. Its successful application in social navigation tasks opens up new avenues for research and development, potentially leading to more adaptable, efficient, and socially aware robotic systems in the future.
