DoorGym: A Scalable Door Opening Environment And Baseline Agent (1908.01887v4)

Published 5 Aug 2019 in cs.RO, cs.AI, and cs.LG

Abstract: In order to practically implement the door opening task, a policy ought to be robust to a wide distribution of door types and environment settings. Reinforcement Learning (RL) with Domain Randomization (DR) is a promising technique to enforce policy generalization; however, there are only a few accessible training environments that are inherently designed to train agents in domain randomized environments. We introduce DoorGym, an open-source door opening simulation framework designed to utilize domain randomization to train a stable policy. We intend for our environment to lie at the intersection of domain transfer, practical tasks, and realism. We also provide baseline Proximal Policy Optimization and Soft Actor-Critic implementations, which achieve success rates from 0% to 95% for opening various types of doors in this environment. Moreover, the real-world transfer experiment shows the trained policy is able to work in the real world. Environment kit available here: https://github.com/PSVL/DoorGym/

Authors

Yusuke Urakami
Alec Hodgkinson
Casey Carlin
Randall Leu
Luca Rigazio
Pieter Abbeel

Summary

The paper introduces a novel simulation environment that integrates domain randomization with realistic visual rendering to enhance RL policy training.
It demonstrates baseline implementations using PPO and SAC with success rates from 0% to 95%, highlighting the sensitivity of performance to scenario complexity.
In a real-world validation, a PPO policy achieved a 59% success rate and the vision network localized doorknobs with an error of approximately 4.95 cm, indicating that policies trained in simulation can transfer to a physical door.

Overview of DoorGym: A Scalable Door Opening Environment and Baseline Agent

The paper introduces DoorGym, an open-source simulation framework designed to advance the development of robust robotic policies for door opening tasks. DoorGym facilitates Reinforcement Learning (RL) using Domain Randomization (DR), an effective technique for enhancing policy generalization across tasks that entail interacting with a variety of door types and environmental conditions.

Key Contributions

DoorGym provides a highly customizable platform that combines domain transfer, practical tasks, and realistic simulation. The environment is based on the architecture of the OpenAI Gym framework and uses the Unity game engine to render visually realistic scenes, addressing the limited visual fidelity of rendering with physics engines such as MuJoCo.

Simulation and Environment Design: DoorGym enables robust training of RL agents by providing various randomizable parameters, including door knob types, door dynamics, and visual elements of the simulation, thereby ensuring agents are exposed to a diverse range of scenarios. This diversity is imperative for achieving effective real-world policy transfer.
Baseline Agents and Results: The authors present baseline Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC) implementations, reporting success rates ranging from 0% to 95% across different door opening tasks. This performance variability underscores the importance of scenario complexity and emphasizes the necessity for well-tuned algorithms in RL.
Real-World Transfer Experimentation: The authors run a real-world experiment to validate the simulation-to-reality performance of trained policies. The PPO-based policy achieved a 59% success rate in opening a physical door, which suggests that simulation environments combined with DR can carry over to real-world robotics applications.
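The randomizable parameters described in the list above can be illustrated with a minimal sketch of per-episode sampling. The parameter names, knob categories, and numeric ranges here are hypothetical and chosen only for illustration; they are not taken from DoorGym's actual configuration.

```python
import random
from dataclasses import dataclass

# Hypothetical knob categories and parameter ranges (illustration only,
# not DoorGym's real settings).
KNOB_TYPES = ("pull", "lever", "round")


@dataclass(frozen=True)
class DoorConfig:
    knob_type: str          # which knob model to spawn
    door_mass_kg: float     # door dynamics
    hinge_damping: float    # door dynamics
    light_intensity: float  # visual randomization


def sample_door_config(rng: random.Random) -> DoorConfig:
    """Draw one randomized door configuration for a training episode."""
    return DoorConfig(
        knob_type=rng.choice(KNOB_TYPES),
        door_mass_kg=rng.uniform(5.0, 30.0),
        hinge_damping=rng.uniform(0.1, 2.0),
        light_intensity=rng.uniform(0.3, 1.0),
    )


# A seeded generator keeps the sampled set of environments reproducible.
rng = random.Random(0)
configs = [sample_door_config(rng) for _ in range(1000)]
```

Sampling a fresh configuration at every episode reset is what exposes the agent to a distribution of doors rather than a single instance.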

Technical Insights and Results

The reported results indicate that PPO outperforms SAC in this domain, particularly in its ability to exploit a learned policy. The success of DR in training policies that generalize across randomized environments underscores the value of preparing agents for operating conditions not seen during training.

For instance, PPO achieved a success rate of 95% or higher in specific configurations when given ground-truth doorknob positions from the simulator. Replacing the ground truth with a vision network's position estimate exposed a weakness: estimation errors significantly reduced task success, with rates falling to 48% under certain conditions.
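To see why position estimation error can matter so much, consider a toy model, not the paper's evaluation: the true knob position is perturbed with Gaussian noise, and an attempt counts as a success only if the estimate lands within a fixed tolerance. The noise model, the tolerance, and the knob position are all assumptions made for this sketch.

```python
import math
import random


def noisy_knob_estimate(true_pos, sigma_m, rng):
    """Perturb each axis of true_pos with independent Gaussian noise."""
    return tuple(c + rng.gauss(0.0, sigma_m) for c in true_pos)


def toy_success_rate(sigma_m, tol_m, trials, seed=0):
    """Fraction of trials whose estimate falls within tol_m of the truth.

    This is a stand-in for task success, not a door-opening simulation.
    """
    rng = random.Random(seed)
    true_pos = (0.6, 0.0, 1.0)  # hypothetical knob position in metres
    hits = 0
    for _ in range(trials):
        estimate = noisy_knob_estimate(true_pos, sigma_m, rng)
        if math.dist(estimate, true_pos) <= tol_m:
            hits += 1
    return hits / trials


# Ground truth (zero noise) versus a noisy estimator, same tolerance.
rate_ground_truth = toy_success_rate(sigma_m=0.0, tol_m=0.02, trials=500)
rate_noisy = toy_success_rate(sigma_m=0.05, tol_m=0.02, trials=500)
```

In this toy setting the zero-noise case always succeeds, while the noisy case succeeds only when the error happens to stay inside the tolerance, which mirrors the qualitative gap between ground-truth and vision-based inputs.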

The vision network trained in a DR-enriched environment also showed promising real-world results, localizing real doorknobs with an error of approximately 4.95 cm.
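A localization figure like the one above is commonly computed as the mean Euclidean distance between predicted and measured positions. The sketch below assumes that definition, which this summary does not confirm, and uses made-up coordinates purely to exercise the function.

```python
import math


def mean_localization_error_cm(predicted_m, actual_m):
    """Mean Euclidean distance between paired 3D points, in centimetres.

    Both arguments are equal-length sequences of (x, y, z) in metres.
    """
    if len(predicted_m) != len(actual_m) or not predicted_m:
        raise ValueError("need two non-empty sequences of equal length")
    total_m = sum(math.dist(p, a) for p, a in zip(predicted_m, actual_m))
    return 100.0 * total_m / len(predicted_m)


# Illustrative values only, not data from the paper.
predicted = [(0.00, 0.00, 0.00), (1.00, 0.00, 0.00)]
actual = [(0.03, 0.04, 0.00), (1.00, 0.00, 0.10)]
error_cm = mean_localization_error_cm(predicted, actual)
```

Reporting the error in centimetres makes it directly comparable with the physical size of a doorknob and the gripper's tolerance.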

Implications and Future Directions

The introduction of DoorGym has substantial implications for robotic system design, offering a versatile platform for learning complex sensorimotor behaviors. Because the environment is widely customizable, robotics practitioners can simulate challenging scenarios with high fidelity and use them to refine robotic capabilities in both simulated and real-world settings.

Looking forward, potential extensions to DoorGym could include more complex tasks, such as handling locked doors and a broader range of doorknob types, which would better capture the complexity of human-like manipulation. Exploring multi-agent scenarios could also offer insight into collaborative robotic tasks.

DoorGym serves as a strategic contribution to bridging the gap between simulation and reality in robotics, advancing techniques for DR, and catalyzing further developments in the robust deployment of RL agents in dynamic environments. As a benchmark, DoorGym opens avenues for future research aimed at enhancing RL frameworks and facilitating rapid advancements in robot autonomy and adaptability.
