Decentralized Non-Communicating Multiagent Collision Avoidance with Deep Reinforcement Learning
The paper presents a decentralized collision avoidance algorithm for multiagent systems based on deep reinforcement learning (DRL). The authors address the challenge of finding feasible, collision-free paths when agents do not communicate their intents, using DRL to shift the expensive computation from online execution to an offline learning phase. The result is efficient, real-time path planning driven by a value network that encodes the expected time to goal while accommodating uncertainty in other agents' motion.
Problem Formulation and Methodology
This work frames collision avoidance as a partially observable sequential decision-making problem. Each agent's state is divided into an observable component (position, velocity, and size) and a hidden component (goal, preferred speed, and heading angle). The objective is to minimize the expected time to reach the goal while avoiding collisions; uncertainty about other agents' hidden intents and future motion is captured through a probabilistic state-transition model.
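To make the state decomposition concrete, here is a minimal sketch in Python. The field names and the dataclass layout are illustrative assumptions for this summary, not the paper's notation:

```python
from dataclasses import dataclass

@dataclass
class ObservableState:
    # Components an agent can directly measure about a nearby agent.
    px: float      # position (x)
    py: float      # position (y)
    vx: float      # velocity (x)
    vy: float      # velocity (y)
    radius: float  # agent size

@dataclass
class FullState(ObservableState):
    # Hidden components, known only to the agent itself.
    gx: float      # goal (x)
    gy: float      # goal (y)
    v_pref: float  # preferred speed
    theta: float   # heading angle
```

Each agent thus plans with its own FullState but observes only the ObservableState of others, which is what makes the problem partially observable.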
The authors opt for a reinforcement learning approach, parameterizing the value function with a deep neural network. This network guides the agent's policy through a one-step lookahead: at each step, the agent evaluates candidate actions by propagating the world one step forward and consulting the learned value, so the chosen action accounts for future interactions at low online cost. In simulation, the method produces significantly better paths than state-of-the-art reactive planners such as Optimal Reciprocal Collision Avoidance (ORCA).
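A minimal sketch of this one-step lookahead is shown below, assuming a trained value network is available as a callable `value_net`. The dictionary-based state, the constant-velocity propagation of the other agent, and the specific reward constants are illustrative stand-ins for the paper's formulation:

```python
import numpy as np

def choose_action(s_self, s_other, actions, value_net, gamma=0.97, dt=0.2):
    """One-step lookahead: pick the candidate velocity maximizing the
    immediate reward plus the discounted value of the propagated state."""
    best_a, best_q = None, -np.inf
    for a in actions:  # each action is a candidate velocity vector
        # Propagate own position under the candidate action.
        p_self = s_self["pos"] + a * dt
        # Assume the other agent keeps its current (filtered) velocity.
        p_other = s_other["pos"] + s_other["vel"] * dt
        # Illustrative sparse reward: reach goal, collide, or get too close.
        gap = (np.linalg.norm(p_self - p_other)
               - s_self["radius"] - s_other["radius"])
        if np.linalg.norm(p_self - s_self["goal"]) < s_self["radius"]:
            r = 1.0             # reached the goal
        elif gap < 0:
            r = -0.25           # collision
        elif gap < 0.2:
            r = -0.1 + gap / 2  # uncomfortably close
        else:
            r = 0.0
        # Discounting scaled by dt * v_pref keeps values comparable
        # across agents with different preferred speeds.
        v = value_net(p_self, a, p_other, s_other["vel"])
        q = r + gamma ** (dt * s_self["v_pref"]) * v
        if q > best_q:
            best_a, best_q = a, q
    return best_a
```

The key point is that the expensive part, learning `value_net`, happens offline; the online policy is just this argmax over a small discrete action set.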
Key Contributions and Results
The paper makes several notable contributions:
- Two-Agent Collision Avoidance Algorithm: The introduction of a DRL-based approach that efficiently handles the interactive complexities between two agents.
- Multiagent Generalization: Extending the approach to scenarios with more than two agents, offering a systematic methodology for broader applicability.
- Handling Kinematic Constraints: Integrating rotational and other kinematic constraints within the DRL framework to accommodate realistic operational conditions (a minimal sketch of one such constrained action set follows this list).
- Enhanced Simulation Performance: Demonstrating more than 26% improvement in path quality over ORCA, highlighting the proposed method's efficacy.
- Real-Time Implementation: Demonstrating real-time applicability in simulations of decentralized systems with up to ten agents.
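As referenced above, one way to respect a rotational constraint is to restrict the discrete action set to headings near the agent's current one. The Python sketch below illustrates that idea; the sampling resolution and the turn-rate bound are assumptions for this summary, not the paper's exact discretization:

```python
import numpy as np

def constrained_actions(v_pref, theta, max_turn=np.pi / 6,
                        n_speeds=5, n_headings=7):
    """Candidate velocities limited to headings within max_turn of theta."""
    actions = [np.zeros(2)]  # always allow stopping
    for speed in np.linspace(v_pref / n_speeds, v_pref, n_speeds):
        for dtheta in np.linspace(-max_turn, max_turn, n_headings):
            heading = theta + dtheta
            actions.append(speed * np.array([np.cos(heading),
                                             np.sin(heading)]))
    return actions
```

The same lookahead policy can then operate over this restricted action set, which suggests how kinematic limits can be folded in without changing the value network itself.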
Implications and Future Directions
Practically, this research offers a scalable solution for multiagent systems without reliable communication, a common situation in autonomous driving and mobile robotics. Theoretically, it extends the application of DRL to navigation in partially observable environments without centralized planning or extensive communication.
Future research might improve generalization across diverse environments or integrate memory units into the neural network to better infer intents from past interactions. Other promising avenues include experimentally validating these approaches in real-world scenarios and exploring cooperative learning paradigms that could further strengthen non-communicative coordination.
This work merges deep learning and robotics, proposing a system that balances computational tractability with realistic operating constraints in multiagent collision avoidance, marking an advance in agent autonomy and interaction modeling.