- The paper presents GA3C-CADRL, a DRL-based framework for decentralized collision avoidance among dynamic, decision-making agents.
- It integrates LSTM cells to convert variable-length multi-agent observations into fixed-length vectors, enhancing adaptive motion planning.
- Simulation and real-world trials show improved collision avoidance and path efficiency in complex, crowded environments.
Motion Planning Among Dynamic, Decision-Making Agents with Deep Reinforcement Learning
The paper "Motion Planning Among Dynamic, Decision-Making Agents with Deep Reinforcement Learning" presents an innovative approach to decentralized collision avoidance for autonomous robots operating in environments populated with dynamic agents such as pedestrians. Unlike traditional methods that rely on fixed assumptions about agent behaviors, this work employs deep reinforcement learning (DRL) to model complex interactions in a more flexible manner, particularly using Long Short-Term Memory (LSTM) networks to handle varying numbers of agents.
Overview
The authors address a significant challenge in robotic motion planning: how to safely and efficiently navigate environments where the other agents are dynamic, decision-making entities. Prior approaches often limited their applicability by assuming homogeneity among agents or fixed behavior models, leading to performance degradation as environment complexity increased. This work proposes a novel DRL-based framework, GA3C-CADRL, which eschews such assumptions. The algorithm builds on GA3C, a hybrid GPU/CPU implementation of the Asynchronous Advantage Actor-Critic (A3C) algorithm, adapted here to the collision avoidance setting.
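To make the actor-critic structure concrete, here is a minimal PyTorch sketch of an A3C-style objective: a policy head and a value head share a trunk, and the update combines a policy-gradient term weighted by the advantage, a value-regression term, and an entropy bonus. The network sizes, action count, and loss coefficients are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ActorCritic(nn.Module):
    """Shared-trunk actor-critic network in the A3C style. The hidden
    size and discrete action count are illustrative assumptions."""

    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 256):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.policy_head = nn.Linear(hidden, n_actions)  # logits over discrete actions
        self.value_head = nn.Linear(hidden, 1)           # state-value estimate V(s)

    def forward(self, obs: torch.Tensor):
        h = self.trunk(obs)
        return self.policy_head(h), self.value_head(h).squeeze(-1)

def a3c_loss(model, obs, actions, returns,
             value_coef: float = 0.5, entropy_coef: float = 0.01):
    """Policy-gradient term weighted by the advantage, plus value
    regression and an entropy bonus that discourages premature
    convergence to a deterministic policy."""
    logits, values = model(obs)
    log_probs = F.log_softmax(logits, dim=-1)
    advantages = (returns - values).detach()  # baseline gets no policy gradient
    chosen = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    policy_loss = -(chosen * advantages).mean()
    value_loss = F.mse_loss(values, returns)
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1).mean()
    return policy_loss + value_coef * value_loss - entropy_coef * entropy
```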
Methodology
Key to this approach is the integration of LSTM cells to process observations from an arbitrary number of agents, converting variable-length inputs into a fixed-length vector that informs decision-making. The mechanism is adapted from sequence-encoding techniques common in Natural Language Processing, letting the system represent the environment's dynamics without constraining the number of participants; agents are fed in order of decreasing distance, so the closest (most relevant) neighbor has the greatest influence on the final hidden state. The policy then selects actions from this LSTM-encoded state, a design validated through extensive simulation trials and real-world robotic demonstrations.
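A minimal sketch of this encoding step follows, assuming a small per-agent feature vector and hidden size (both illustrative choices, not the paper's exact dimensions). The key property is that the output dimensionality is independent of how many agents were observed.

```python
import torch
import torch.nn as nn

class AgentEncoder(nn.Module):
    """Encodes observations of a variable number of neighbouring agents
    into a fixed-length vector. The per-agent feature size and hidden
    size below are illustrative assumptions."""

    def __init__(self, agent_feat_dim: int = 7, hidden_dim: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(agent_feat_dim, hidden_dim, batch_first=True)

    def forward(self, agent_obs: torch.Tensor) -> torch.Tensor:
        # agent_obs: (batch, n_agents, agent_feat_dim), with agents ordered
        # by decreasing distance so the closest agent is processed last and
        # dominates the final hidden state.
        _, (h_n, _) = self.lstm(agent_obs)
        return h_n[-1]  # (batch, hidden_dim): fixed-length summary

# The same encoder handles any number of neighbours without retraining.
enc = AgentEncoder()
few = enc(torch.randn(1, 2, 7))    # two nearby agents
many = enc(torch.randn(1, 10, 7))  # ten nearby agents
assert few.shape == many.shape == (1, 64)
```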
Training begins with supervised learning on trajectories from an existing CADRL solution, which warm-starts the policy, followed by deep reinforcement learning that refines it over many episodes of varying complexity and agent counts. This two-phase procedure contributes to the method's adaptability across operational contexts.
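A hedged sketch of the two-phase schedule, pairing with the ActorCritic model sketched above: phase one regresses the policy toward an existing CADRL planner's actions, after which training switches to the RL objective. The dataset format yielded by demo_loader is an assumption for illustration.

```python
import torch
import torch.nn.functional as F

def pretrain_supervised(model, demo_loader, optimizer, epochs: int = 5):
    """Phase 1: regress the policy toward an existing CADRL planner's
    actions (and its value estimates) before any RL. The (obs, action,
    value) batch format yielded by demo_loader is an assumption."""
    for _ in range(epochs):
        for obs, expert_action, value_target in demo_loader:
            logits, value = model(obs)
            loss = (F.cross_entropy(logits, expert_action)  # imitate expert actions
                    + F.mse_loss(value, value_target))      # warm-start the critic
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

# Phase 2 then continues from these weights with the RL objective,
# e.g. the a3c_loss sketched earlier, over episodes of growing
# complexity and agent counts.
```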
Results and Implications
In controlled simulations, GA3C-CADRL substantially outperformed previous models, particularly in multi-agent scenarios where traditional methods faltered. The results show that the new policy maintains a high collision-avoidance success rate as the number of agents grows, while also slightly improving path efficiency. In real-world demonstrations, a robot navigated effectively at human walking speeds without 3D lidar, relying instead on fused data from multiple onboard sensors fed to the learned policy.
The practical implications of this research are significant: the gains in computational efficiency and adaptability are crucial for real-time operation in crowded environments. By reducing reliance on structured assumptions about other agents' behavior, the framework applies to a broad range of settings, from autonomous vehicles to service robots operating in pedestrian spaces.
Future Directions
The flexibility and performance gains observed suggest several directions for further work, such as integrating additional sensing modalities and studying how explicit agent signaling shapes collective behavior. The approach may also transfer to other domains requiring complex, decentralized decision-making, such as swarm robotics or urban traffic management.
This research contributes meaningfully to advances in autonomous navigation, presenting a robust framework that adapts learning-based methodologies to real-world challenges. The results strengthen the case for DRL in dynamic, interactive environments, marking a step forward in intelligent motion planning.