An Overview of Reinforcement Learned Distributed Multi-Robot Navigation with Reciprocal Velocity Obstacle Shaped Rewards
The paper "Reinforcement Learned Distributed Multi-Robot Navigation with Reciprocal Velocity Obstacle Shaped Rewards" introduces a distributed approach to multi-robot navigation that combines Reciprocal Velocity Obstacles (RVO) with Deep Reinforcement Learning (DRL) to achieve reciprocal collision avoidance when each robot has only limited information about its neighbors. The work addresses inherent challenges of decentralized multi-robot systems, aiming for adaptive, efficient navigation policies that do not rely on centralized control and therefore scale to complex environments containing both dynamic and static obstacles.
Key Contributions
The work makes three main contributions:
- Environmental State Representation: The authors propose a novel environmental state representation that combines Velocity Obstacle (VO) and RVO vectors to explicitly model interactions with dynamic agents and static obstacles. This representation captures the geometry of collision-avoidance interactions compactly, allowing robots to navigate cluttered environments autonomously.
- Neural Network Design: A specialized bi-directional recurrent neural network architecture maps the continuous states of surrounding obstacles to control actions. Bidirectional Gated Recurrent Units (BiGRUs) process variable-length sequential inputs, aggregating information about neighboring agents from both directions along the sequence so that a fixed-size feature is produced regardless of how many neighbors are present.
- Reward Function: The reward function integrates RVO regions and expected collision times, giving robots an incentive to share avoidance effort reciprocally while trading off collision risk against travel efficiency.
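To make the VO/RVO machinery above concrete, here is a minimal geometric sketch for two circular robots in the plane. The `rvo_penalty` shaping term, its `horizon` and `weight` values, and the urgency scaling are illustrative assumptions, not the paper's exact reward coefficients; only the VO/RVO membership tests and the time-to-collision computation follow the standard definitions.

```python
import math

def time_to_collision(p_rel, v_rel, radius):
    """Earliest t >= 0 at which two discs with combined radius `radius` touch.

    p_rel: vector from robot A to robot B; v_rel: velocity of A relative to B.
    Returns math.inf if the current velocities never lead to contact.
    """
    px, py = p_rel
    vx, vy = v_rel
    # |v_rel * t - p_rel| = radius  ->  quadratic in t
    a = vx * vx + vy * vy
    b = -2.0 * (px * vx + py * vy)
    c = px * px + py * py - radius * radius
    disc = b * b - 4.0 * a * c
    if a == 0.0 or disc < 0.0:
        return math.inf
    t = (-b - math.sqrt(disc)) / (2.0 * a)
    return t if t >= 0.0 else math.inf

def in_vo(p_a, p_b, v, v_b, r_sum):
    """Candidate velocity v for A lies in the velocity obstacle induced by B."""
    p_rel = (p_b[0] - p_a[0], p_b[1] - p_a[1])
    v_rel = (v[0] - v_b[0], v[1] - v_b[1])
    return math.isfinite(time_to_collision(p_rel, v_rel, r_sum))

def in_rvo(p_a, p_b, v, v_a, v_b, r_sum):
    """RVO membership: v is in the RVO iff 2v - v_a lies in the plain VO."""
    v2 = (2.0 * v[0] - v_a[0], 2.0 * v[1] - v_a[1])
    return in_vo(p_a, p_b, v2, v_b, r_sum)

def rvo_penalty(p_a, p_b, v_new, v_a, v_b, r_sum, horizon=5.0, weight=-0.25):
    """Hypothetical shaping term: penalize RVO velocities, scaled by urgency."""
    if not in_rvo(p_a, p_b, v_new, v_a, v_b, r_sum):
        return 0.0
    v2 = (2.0 * v_new[0] - v_a[0], 2.0 * v_new[1] - v_a[1])
    p_rel = (p_b[0] - p_a[0], p_b[1] - p_a[1])
    v_rel = (v2[0] - v_b[0], v2[1] - v_b[1])
    ttc = time_to_collision(p_rel, v_rel, r_sum)
    return weight * min(1.0, horizon / max(ttc, 1e-3))
```

For example, a robot at the origin heading straight at a neighbor two meters away (combined radius 1 m) sits inside the RVO and receives a negative shaping term, while a velocity perpendicular to the neighbor does not.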
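The BiGRU encoder's structural role can also be sketched: it turns a variable-length list of per-neighbor feature vectors into one fixed-size feature by running a GRU over the sequence in both directions and concatenating the final hidden states. This is a from-scratch sketch with untrained random weights, intended only to show the shapes involved; the paper's actual layer sizes, input features, and training are not reproduced here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def make_params(rng, d_in, d_h):
    # Three gates per cell (update z, reset r, candidate n), small random init
    W = rng.normal(0.0, 0.1, (3, d_h, d_in))
    U = rng.normal(0.0, 0.1, (3, d_h, d_h))
    b = np.zeros((3, d_h))
    return W, U, b

def gru_cell(x, h, params):
    """One GRU step: gated blend of the old state h and a candidate state n."""
    W, U, b = params
    z = sigmoid(W[0] @ x + U[0] @ h + b[0])
    r = sigmoid(W[1] @ x + U[1] @ h + b[1])
    n = np.tanh(W[2] @ x + U[2] @ (r * h) + b[2])
    return (1.0 - z) * h + z * n

def bigru_encode(seq, fwd, bwd, d_h):
    """Run a GRU over neighbor features in both orders; concat final states."""
    h_f = np.zeros(d_h)
    for x in seq:
        h_f = gru_cell(x, h_f, fwd)
    h_b = np.zeros(d_h)
    for x in reversed(seq):
        h_b = gru_cell(x, h_b, bwd)
    return np.concatenate([h_f, h_b])  # fixed-size code for any neighbor count

rng = np.random.default_rng(0)
d_in, d_h = 6, 16  # assumed sizes for illustration
fwd, bwd = make_params(rng, d_in, d_h), make_params(rng, d_in, d_h)
# Scenes with different neighbor counts map to the same-sized feature
code3 = bigru_encode([rng.normal(size=d_in) for _ in range(3)], fwd, bwd, d_h)
code7 = bigru_encode([rng.normal(size=d_in) for _ in range(7)], fwd, bwd, d_h)
```

Feeding this fixed-size code into a small feedforward head is what lets a single policy handle crowds of arbitrary size.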
Experimental Outcomes
Experiments show that the proposed approach outperforms existing methods such as SARL, GA3C-CADRL, and NH-ORCA in success rate, travel time, and average speed across a variety of simulated environments. Embedding the RVO framework in a DRL context markedly improved the adaptability and efficiency of the learned policies, particularly in densely populated scenarios. The system also exhibited robustness and low computational overhead, making it feasible for real-time deployment.
Implications and Future Directions
The implications of this work span both practical and theoretical domains in robotics and autonomous systems. Practically, the proposed framework offers a decentralized navigation policy that can be scaled to numerous robots without intensive computational resources or stringent communication requirements. Theoretically, the successful integration of RVO concepts into DRL frameworks opens avenues for further exploration of collision avoidance strategies in dynamic and uncertain environments.
Future developments in this area might focus on extending the approach to more complex real-world applications with higher variability and unpredictability, incorporating elements such as non-holonomic constraints in greater detail, and addressing multi-agent coordination under partial observability. The framework's potential can also be harnessed to enhance collaborative and cooperative behaviors in robot swarms, potentially impacting fields like warehouse automation and disaster response.
In summary, this work takes significant strides towards resolving critical challenges in multi-robot systems, contributing to the development of more autonomous, intelligent, and efficient robotic networks. Through careful innovation in state representation, learning architecture, and reward mechanisms, the paper provides a robust platform upon which future research can build.