
PRM-RL: Long-range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-based Planning (1710.03937v2)

Published 11 Oct 2017 in cs.AI, cs.LG, and cs.RO

Abstract: We present PRM-RL, a hierarchical method for long-range navigation task completion that combines sampling based path planning with reinforcement learning (RL). The RL agents learn short-range, point-to-point navigation policies that capture robot dynamics and task constraints without knowledge of the large-scale topology. Next, the sampling-based planners provide roadmaps which connect robot configurations that can be successfully navigated by the RL agent. The same RL agents are used to control the robot under the direction of the planning, enabling long-range navigation. We use the Probabilistic Roadmaps (PRMs) for the sampling-based planner. The RL agents are constructed using feature-based and deep neural net policies in continuous state and action spaces. We evaluate PRM-RL, both in simulation and on-robot, on two navigation tasks with non-trivial robot dynamics: end-to-end differential drive indoor navigation in office environments, and aerial cargo delivery in urban environments with load displacement constraints. Our results show improvement in task completion over both RL agents on their own and traditional sampling-based planners. In the indoor navigation task, PRM-RL successfully completes up to 215 m long trajectories under noisy sensor conditions, and the aerial cargo delivery completes flights over 1000 m without violating the task constraints in an environment 63 million times larger than used in training.

Authors (7)
  1. Aleksandra Faust (60 papers)
  2. Oscar Ramirez (5 papers)
  3. Marek Fiser (7 papers)
  4. Kenneth Oslund (3 papers)
  5. Anthony Francis (76 papers)
  6. James Davidson (15 papers)
  7. Lydia Tapia (6 papers)
Citations (278)

Summary

PRM-RL: Advancements in Long-range Robotic Navigation

The paper introduces PRM-RL, a novel hierarchical approach that improves long-range robotic navigation by integrating reinforcement learning (RL) agents with sampling-based planning, specifically Probabilistic Roadmaps (PRMs). The combination leverages the strengths of both methodologies to address navigation tasks involving complex environments and non-trivial robot dynamics.

Core Methodology

PRM-RL uses reinforcement learning agents for local navigation: they learn point-to-point policies that cope with sensor noise and dynamic constraints, agnostic to the global topology but proficient at managing the robot's dynamics and maintaining task constraints locally. The long-range plan is built by constructing a roadmap in which two sampled configurations are connected only if the RL agent can successfully navigate between them, in contrast to traditional PRMs, which typically validate edges as simple collision-free paths computed via C-space interpolation. At query time, the same RL agent controls the robot along the waypoints the roadmap provides.
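A minimal sketch of this construction is given below, assuming a trained short-range policy is available. The names `LocalRLPolicy`, `try_navigate`, and `build_prm_rl_roadmap`, as well as the trial count and success threshold, are illustrative assumptions rather than the paper's implementation.

```python
from itertools import combinations


# Minimal sketch of PRM-RL roadmap construction. LocalRLPolicy, try_navigate,
# build_prm_rl_roadmap, and the trial/threshold parameters are illustrative
# assumptions, not names from the paper or its code release.
class LocalRLPolicy:
    """Stand-in for a trained short-range, point-to-point RL agent."""

    def try_navigate(self, start, goal, env):
        """Roll out the policy from `start` toward `goal` in simulation.

        Should return True if the rollout reaches the goal without collision
        or task-constraint violations (e.g., load displacement bounds).
        A trained agent (feature-based or deep-net policy) plugs in here.
        """
        raise NotImplementedError


def build_prm_rl_roadmap(samples, policy, env, trials=20, threshold=0.9):
    """Connect sampled configurations with edges the RL agent can traverse.

    Unlike a classical PRM, an edge is not validated by straight-line
    C-space interpolation: it is added only when repeated rollouts of the
    local RL policy succeed often enough (a Monte Carlo reliability check).
    """
    edges = {i: set() for i in range(len(samples))}
    for i, j in combinations(range(len(samples)), 2):
        successes = sum(
            policy.try_navigate(samples[i], samples[j], env)
            for _ in range(trials)
        )
        if successes / trials >= threshold:
            # Traversability is treated as symmetric here for simplicity.
            edges[i].add(j)
            edges[j].add(i)
    return edges
```

At query time, a standard graph search over `edges` (e.g., Dijkstra or A*) yields a waypoint sequence, and the same local policy then steers the robot from waypoint to waypoint.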

Numerical Results and Claim Validation

The paper offers substantial numerical evidence that PRM-RL handles tasks beyond the reach of both stand-alone RL agents and traditional sampling-based planners. In the indoor navigation task, PRM-RL completes trajectories as long as 215 meters under noisy sensor conditions. For aerial cargo delivery, it achieves flights over 1000 meters while respecting load displacement constraints, in an environment 63 million times larger than the one used in training. PRM-RL also performs consistently in the presence of noisy sensors and dynamic obstacles, conditions that neither PRMs nor RL agents manage as robustly on their own.

Implications and Future Directions

The fusion of RL and PRMs in PRM-RL is significant for both theoretical and practical robotics applications. It highlights the potential for RL to influence classical sampling-based planning, extending its functionality into dynamically complex and sensor-noisy environments. From a practical standpoint, this methodology can be applied to real-world robotics challenges such as autonomous vehicle navigation, search and rescue operations, and logistics via unmanned aerial vehicles (UAVs), where safety and obstacle avoidance are paramount.

Further research could explore scaling PRM-RL to even larger or more dynamically challenging environments, potentially integrating more sophisticated forms of reinforcement learning such as hierarchical or multi-agent systems. Enhancing the PRM component with adaptive learning methods could let the roadmap adapt in real time to changes in task dynamics or environmental characteristics.

Conclusion

PRM-RL represents a meaningful integration of traditional sampling-based planning with modern learning-based control, establishing a paradigm in which the complexities of local dynamics and global navigation are managed symbiotically. The paper makes an effective case for a direction in robotic navigation that combines the strengths of sampling-based planning and learned policies, opening pathways for future research and application in advanced robotics systems.
