Two-step dynamic obstacle avoidance (2311.16841v2)

Published 28 Nov 2023 in cs.RO and cs.AI

Abstract: Dynamic obstacle avoidance (DOA) is a fundamental challenge for any autonomous vehicle, independent of whether it operates at sea, in the air, or on land. This paper proposes a two-step architecture for handling DOA tasks by combining supervised and reinforcement learning (RL). In the first step, we introduce a data-driven approach to estimate the collision risk (CR) of an obstacle using a recurrent neural network, which is trained in a supervised fashion and offers robustness to non-linear obstacle movements. In the second step, we include these CR estimates in the observation space of an RL agent to increase its situational awareness. We illustrate the power of our two-step approach by training different RL agents in a challenging environment that requires navigating amid multiple obstacles. The non-linear obstacle movements are modeled, by way of example, using stochastic processes and periodic patterns, although our architecture is suitable for arbitrary obstacle dynamics. The experiments reveal that integrating our CR metrics into the observation space doubles the performance in terms of reward, which is equivalent to halving the number of collisions in the considered environment. We also perform a generalization experiment to validate the proposal in an RL environment based on maritime traffic and real-world vessel trajectory data. Furthermore, we show that the architecture's performance improvement is independent of the applied RL algorithm.


Summary

  • The paper presents a two-step architecture that combines supervised learning for collision risk estimation with reinforcement learning, substantially reducing collision rates.
  • It leverages an LSTM model to predict non-linear obstacle trajectories, enabling the computation of collision risk metrics that inform decision-making.
  • Empirical results show doubled rewards and halved collision rates in simulations, independent of the RL algorithm used.

Overview

Dynamic obstacle avoidance (DOA) is essential for the safety of autonomous vehicles, whether they operate on waterways, in airspace, or on roads. The paper introduces a two-step approach that combines supervised and reinforcement learning to handle DOA tasks: it first assesses the collision risk posed by nearby obstacles and then uses this information to improve the decision-making of an autonomous agent. The paper demonstrates that the approach substantially reduces the number of collisions, indicating its potential for enhancing autonomous navigation systems.

Step 1: Estimating Collision Risk

The first step leverages supervised learning to predict an obstacle's trajectory, accounting for potentially non-linear movements. Specifically, a recurrent neural network (an LSTM) estimates future positions from past observations, from which collision risk metrics can be computed. This captures complex movement patterns and yields estimates of quantities such as the distance and time at the closest point of approach (CPA).
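
A minimal sketch of this step, assuming 2-D positions, a fixed prediction horizon, and PyTorch; the class and function names are illustrative assumptions, not the paper's code:

```python
import numpy as np
import torch
import torch.nn as nn

class TrajectoryLSTM(nn.Module):
    """Predicts an obstacle's next `horizon` positions from a window of past ones."""

    def __init__(self, horizon: int = 10, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=2, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2 * horizon)  # `horizon` future (x, y) pairs
        self.horizon = horizon

    def forward(self, past: torch.Tensor) -> torch.Tensor:
        # past: (batch, window, 2) observed positions
        _, (h, _) = self.lstm(past)
        return self.head(h[-1]).view(-1, self.horizon, 2)

def cpa_metrics(own: np.ndarray, obst: np.ndarray, dt: float = 1.0):
    """DCPA/TCPA from two (T, 2) position sequences on a shared time grid."""
    dists = np.linalg.norm(own - obst, axis=1)  # range at each future step
    k = int(np.argmin(dists))                   # step of closest approach
    return float(dists[k]), k * dt              # (DCPA, TCPA)
```

Comparing the predicted obstacle path with the agent's own planned path in this way yields CPA-based features (distance and time at closest approach) that can serve as collision risk inputs.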

Step 2: Integration with Reinforcement Learning

With collision risk estimates at hand, the second step incorporates these metrics into the observation space of a reinforcement learning (RL) agent. Augmenting the observations with the LSTM-derived estimates gives the agent a richer picture of its surroundings and enables better-informed decisions; in the reported simulations, this doubles the reward and halves the collision rate.
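
One way to realize this augmentation, sketched with the Gymnasium API under the assumption of a flat Box observation space (the wrapper name and `risk_fn` are hypothetical, not from the paper):

```python
import numpy as np
import gymnasium as gym

class CollisionRiskWrapper(gym.ObservationWrapper):
    """Appends precomputed collision-risk features to a Box observation.

    `risk_fn` is assumed to map the raw observation to a vector of CR
    metrics (e.g. DCPA/TCPA per obstacle) produced as in step one.
    """

    def __init__(self, env: gym.Env, risk_fn, n_risk_features: int):
        super().__init__(env)
        self.risk_fn = risk_fn
        low = np.concatenate(
            [env.observation_space.low, np.full(n_risk_features, -np.inf)]
        )
        high = np.concatenate(
            [env.observation_space.high, np.full(n_risk_features, np.inf)]
        )
        self.observation_space = gym.spaces.Box(low=low, high=high, dtype=np.float32)

    def observation(self, obs):
        # Concatenate the CR features onto the raw observation vector.
        risk = np.asarray(self.risk_fn(obs), dtype=np.float32)
        return np.concatenate([obs.astype(np.float32), risk])
```

Because the augmentation lives entirely in the observation, the underlying environment and reward remain untouched.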

Empirical Performance

The two-step architecture was tested in a challenging environment in which multiple obstacles move according to stochastic processes and periodic patterns. Integrating the collision risk estimates consistently improved performance, independent of the choice of RL algorithm. A generalization experiment on an RL environment built from maritime traffic and real-world vessel trajectory data further validates the approach, underscoring its robustness across algorithms and its potential for a wide range of autonomous systems.
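
As an illustration of this algorithm-independence, the same wrapped environment can be handed to different off-the-shelf learners. The sketch below uses Stable-Baselines3 with a placeholder environment id, reusing the hypothetical CollisionRiskWrapper and a placeholder my_risk_fn from above:

```python
# Hypothetical usage: the CR-augmented environment is agnostic to the RL algorithm.
import gymnasium as gym
from stable_baselines3 import SAC, TD3

def make_env():
    base = gym.make("DOAEnv-v0")  # placeholder id for a DOA environment
    return CollisionRiskWrapper(base, risk_fn=my_risk_fn, n_risk_features=2)

for algo in (SAC, TD3):
    model = algo("MlpPolicy", make_env(), verbose=0)  # same features, different learner
    model.learn(total_timesteps=100_000)
```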

Conclusion

The research concludes that autonomous systems could significantly benefit from employing the two-step architecture for dynamic obstacle avoidance. Future work will focus on enhancing the architecture with real-world data and exploring the influence of uncertainty in collision risk estimates on decision-making. The improvements in safety and efficiency brought by this approach contribute meaningfully towards the development of advanced autonomous transportation networks.