Deep Reinforcement Learning Based Automatic Exploration for Navigation in Unknown Environment
This paper discusses a novel approach to automatic exploration in unknown environments using a Deep Reinforcement Learning (DRL) framework. The focus is on improving the adaptability and efficiency of robotic systems in performing autonomous navigation tasks without prior knowledge of the environment.
Traditional methods for robot exploration, such as frontier-based and information-based techniques, devise strategies from hand-engineered map features and complex optimization procedures. While valuable, these approaches often struggle with diverse environmental scenarios and generalize poorly across environments. Moreover, their computational burden grows rapidly with the size and complexity of the environment being mapped.
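For context, frontier-based exploration repeatedly drives the robot toward frontiers: free cells on the boundary between mapped and unexplored space. Below is a minimal sketch of frontier detection on an occupancy grid; the grid encoding and function name are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

UNKNOWN, FREE, OCCUPIED = -1, 0, 1  # assumed occupancy-grid encoding

def find_frontiers(grid: np.ndarray) -> list:
    """Return free cells that are 4-adjacent to at least one unknown cell."""
    frontiers = []
    rows, cols = grid.shape
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != FREE:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr, nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break
    return frontiers
```

A classic frontier-based strategy then selects, say, the nearest frontier as the next goal; it is exactly this hand-crafted selection heuristic that a learned decision policy aims to replace.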
The authors present a structured framework that decomposes exploration into decision-making, planning, and mapping modules, making the robotic system more modular. This design lets established navigation techniques, such as SLAM and path-planning algorithms, be combined with the proposed DRL-based decision algorithm, whose role is to learn exploration strategies through interaction with the partial maps the robot perceives.
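A schematic of how the three modules could interact in such a framework is sketched below; the class and method names are hypothetical placeholders, not the paper's actual interfaces.

```python
class ExplorationAgent:
    """Hypothetical skeleton of the modular exploration loop."""

    def __init__(self, mapper, planner, decision_policy):
        self.mapper = mapper           # mapping module: SLAM builds the partial map
        self.planner = planner         # planning module: goal -> collision-free path
        self.policy = decision_policy  # decision module: partial map -> next goal

    def step(self, sensor_data):
        partial_map = self.mapper.update(sensor_data)  # mapping
        goal = self.policy.select_goal(partial_map)    # decision-making
        return self.planner.plan(partial_map, goal)    # planning
```

Because the learned policy only chooses exploration goals, the underlying SLAM and planning components can be swapped without retraining, which is precisely the flexibility the decomposition is meant to buy.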
The numerical results reported in the paper show significant improvements in learning efficiency and adaptability over previous methods. These findings are supported by simulation and real-world experiments in which policies learned in simulation transferred to real robots and produced effective exploration behavior. This marks an important step towards closing the reality gap commonly faced in DRL applications, where discrepancies between simulated and actual environments hinder policy transfer.
The DRL algorithm is built on a Fully Convolutional Q-Network (FCQN) architecture augmented with an auxiliary edge-segmentation task. The auxiliary task improves training efficiency and map interpretation by steering feature learning toward the edge structures most relevant to exploration, namely obstacles and frontiers. Integrating this feature-extraction paradigm with DRL substantially reduces the complexity of the decision process and makes the resulting policies more robust.
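A minimal PyTorch sketch of such a two-headed architecture follows; the layer sizes, channel counts, and class names are assumptions for illustration and may differ from the paper's exact topology.

```python
import torch
import torch.nn as nn

class FCQN(nn.Module):
    """Fully convolutional Q-network with an auxiliary edge-segmentation head.

    Layer sizes are illustrative assumptions, not the paper's exact design.
    """

    def __init__(self, in_channels: int = 1):
        super().__init__()
        # Shared fully convolutional encoder over the partial map.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
        )
        # Q head: one Q-value per map cell, e.g. scoring candidate goals.
        self.q_head = nn.Conv2d(64, 1, kernel_size=1)
        # Auxiliary head: per-cell classes (background / obstacle / frontier).
        self.seg_head = nn.Conv2d(64, 3, kernel_size=1)

    def forward(self, partial_map: torch.Tensor):
        features = self.encoder(partial_map)
        return self.q_head(features), self.seg_head(features)
```

Training would then combine the usual temporal-difference loss on the Q output with a cross-entropy loss on the segmentation output, so the shared encoder is explicitly pushed toward obstacle and frontier features.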
Practically, this research bears implications for various applications, such as rescue robot operations and domestic cleaning automation. The theoretical contributions, particularly the modular exploration framework, open pathways for further investigations into multirobot cooperation, real-time adaptation, and more sophisticated map representation models.
Future developments could explore hierarchical decision-making, continuous action spaces, and richer map representations such as Hilbert maps. Additionally, recurrent networks could help the framework cope with dynamically changing environments by embedding historical context, a promising avenue for refining the interplay between exploration, mapping fidelity, and computational efficiency in intelligent systems.
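As one purely speculative illustration of that last direction, a recurrent layer could be inserted after the map encoder so that Q-values are conditioned on the history of observations; everything in this sketch is an assumption, not part of the paper's architecture.

```python
import torch
import torch.nn as nn

class RecurrentQHead(nn.Module):
    """Speculative sketch: a GRU over pooled map features carries history."""

    def __init__(self, feature_dim: int = 64, hidden_dim: int = 128, n_actions: int = 8):
        super().__init__()
        self.gru = nn.GRUCell(feature_dim, hidden_dim)
        self.q_out = nn.Linear(hidden_dim, n_actions)

    def forward(self, features: torch.Tensor, hidden: torch.Tensor):
        # features: (batch, feature_dim), pooled from a convolutional map encoder.
        hidden = self.gru(features, hidden)  # fold this step's observation into memory
        return self.q_out(hidden), hidden    # Q-values conditioned on past observations
```

In a dynamic environment, the hidden state would let the policy remember, for example, which regions were recently blocked, information that a purely feedforward map encoding discards.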
Overall, this paper takes a solid step towards applying DRL to autonomous navigation, proposing a scalable and adaptable solution that generalizes across robotic platforms and environmental conditions.