- The paper proposes a novel deep RL framework for autonomous legged locomotion that minimizes manual intervention using multi-task learning.
- It integrates safety-constrained reinforcement learning with dual gradient descent to ensure safe policy training in real-world conditions.
- Experimental results on a quadrupedal Minitaur demonstrate efficient gait adaptation across flat, soft, and uneven terrains with minimal manual resets.
Learning to Walk in the Real World with Minimal Human Effort
The work presented in this paper addresses the significant challenges of autonomously learning legged locomotion policies with deep reinforcement learning (deep RL) in real-world environments. The research focuses on minimizing human involvement in the training process, which is paramount for scaling these systems across diverse tasks and terrains.
Key Contributions
The paper proposes a system that replaces traditional hand-engineered controllers, which demand substantial expertise and remain viable only in narrow scenarios. The authors devise a robust framework for training legged robots to walk autonomously in real-world conditions, addressing two crucial challenges: automating the data-collection process and keeping the robot safe during learning.
- Multi-task Learning Framework: The researchers implement a multi-task learning framework in which the robot learns several locomotion tasks simultaneously. By parameterizing each task with a task vector, the robot can represent different walking directions and switch between them adaptively. A scheduler picks whichever task steers the robot back into its operational boundaries, eliminating manual resets and sharply reducing human intervention (a minimal sketch of such a scheduler appears after this list).
- Safety-Constrained Reinforcement Learning: To mitigate the risk of mechanical damage from falls and keep learning efficient, the system incorporates a safety-constrained RL algorithm. Training is formulated as a constrained Markov decision process (CMDP) that enforces safety conditions (e.g., limits on the robot's posture) during learning, and the reward and the safety constraint are optimized jointly via dual gradient descent (sketched after this list).
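To make the scheduling idea concrete, below is a minimal Python sketch of a workspace-aware task scheduler. The two-task setup and all names (`WORKSPACE_CENTER`, `select_next_task`) are illustrative assumptions, not the authors' code; the sketch only shows the core rule of selecting the task whose walking direction points back toward the center of the training area.

```python
import numpy as np

# A minimal sketch of a workspace-aware task scheduler (assumed names,
# not the paper's implementation).

WORKSPACE_CENTER = np.zeros(2)  # (x, y) center of the training area, meters

# Each task is parameterized by a task vector: the desired walking
# direction expressed in the robot's heading frame.
TASKS = {
    "walk_forward":  np.array([1.0, 0.0]),
    "walk_backward": np.array([-1.0, 0.0]),
}

def select_next_task(position, heading):
    """Pick the task whose walking direction moves the robot back
    toward the workspace center, so no manual reset is needed.

    position: (x, y) of the robot in the world frame.
    heading:  yaw angle in radians.
    """
    to_center = WORKSPACE_CENTER - position
    # Rotate the world-frame "direction home" into the robot's heading frame.
    c, s = np.cos(-heading), np.sin(-heading)
    to_center_local = np.array([c * to_center[0] - s * to_center[1],
                                s * to_center[0] + c * to_center[1]])
    # Choose the task vector best aligned with the direction home.
    return max(TASKS, key=lambda t: TASKS[t] @ to_center_local)
```

Because the "direction home" is recomputed in the robot's heading frame at every task switch, the same small set of task vectors suffices regardless of how the robot is oriented in the workspace.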
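The dual gradient descent update can likewise be sketched in a few lines. The helper names, batch layout, posture thresholds, and the specific safety budget below are assumptions for illustration; the paper pairs a primal-dual scheme of this kind with an off-policy actor-critic learner.

```python
# A minimal sketch of dual gradient descent for the CMDP formulation
# (assumed helper names and values, not the paper's code).

LAMBDA_LR = 1e-2      # step size for the dual variable (assumed)
SAFETY_BUDGET = 0.05  # allowed average safety cost per step (assumed)

lagrange_multiplier = 0.0

def safety_cost(state):
    # Example safety signal: flag extreme pitch/roll, which precedes
    # falls; thresholds here are illustrative.
    return float(abs(state["pitch"]) > 0.4 or abs(state["roll"]) > 0.4)

def dual_gradient_step(batch, policy_update):
    """One primal-dual step: improve the policy on a Lagrangian reward,
    then adjust the multiplier toward constraint satisfaction."""
    global lagrange_multiplier
    costs = [safety_cost(s) for s in batch["states"]]
    avg_cost = sum(costs) / len(costs)

    # Primal step: maximize reward minus the lambda-weighted safety cost.
    shaped_rewards = [r - lagrange_multiplier * c
                      for r, c in zip(batch["rewards"], costs)]
    policy_update(batch["states"], batch["actions"], shaped_rewards)

    # Dual step: raise lambda when the constraint is violated, lower it
    # otherwise, keeping it non-negative.
    lagrange_multiplier = max(
        0.0, lagrange_multiplier + LAMBDA_LR * (avg_cost - SAFETY_BUDGET))
```

The dual variable acts as an automatically tuned penalty weight: it grows while the robot behaves unsafely and shrinks back toward zero once the constraint is satisfied, so the policy is not permanently over-penalized.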
Experimental Results
The proposed system proves effective in real-world testing on a quadrupedal Minitaur robot. The robot successfully learned to walk on three distinct terrains: flat ground, a soft mattress, and a doormat with crevices, all with minimal human intervention.
- Reduced Human Intervention: Two of the three trials required no human intervention, and the third required only minimal manual assistance. This stands in stark contrast to previous methods, which required frequent manual resets. The approach also lowered data requirements substantially, with dual-task training needing fewer samples than a single-task approach.
- Efficient Multi-task Training: The framework demonstrated the capability to train multiple locomotion policies (walking in different directions) concurrently, forming the complete skill set needed for varied navigation: moving forward, moving backward, and turning (a sketch of the task-vector reward parameterization follows this list).
- Real-world Gait Learning: The robot acquired distinct, effective gaits for each terrain, varying strategies based on the surface characteristics. On the flat terrain, it developed different gaits for forward and backward motions, with adaptations observed for soft and uneven surfaces.
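One way to see how a single task vector can encode all of these walking directions is through the reward. The sketch below is an illustration under assumed names and values, not the paper's exact reward terms: each task simply rewards base motion along its desired direction, so one reward function serves every task.

```python
import numpy as np

# A minimal sketch of a task-vector-parameterized locomotion reward
# (illustrative vectors and names; the paper's reward terms differ).

TASK_VECTORS = {
    "forward":   np.array([1.0, 0.0, 0.0]),   # reward +x displacement
    "backward":  np.array([-1.0, 0.0, 0.0]),  # reward -x displacement
    "turn_left": np.array([0.0, 0.0, 1.0]),   # reward +yaw change
}

def task_reward(task, prev_pose, pose):
    """Reward = task vector dotted with the (dx, dy, dyaw) change of the
    base, so a single function covers every walking direction."""
    delta = np.array([pose["x"] - prev_pose["x"],
                      pose["y"] - prev_pose["y"],
                      pose["yaw"] - prev_pose["yaw"]])
    return float(TASK_VECTORS[task] @ delta)
```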
Implications and Future Directions
This research has significant implications for the field of autonomous robotic systems. Reducing human intervention makes it practical to deploy reinforcement learning in real-world environments, which is pivotal for tasks beyond controlled sandbox scenarios. In particular, the system opens pathways for deploying robots on unstructured terrains where detailed models and hand-engineered solutions are impractical.
Future research could extend these methods to more complex environmental dynamics and robots with different morphologies, leveraging domain adaptation techniques to enable cross-deployment without task-specific tuning. Additionally, learning self-recovery behaviors alongside locomotion policies could further enhance the practicality and resilience of autonomous robotic systems.
The contributions of this paper underscore the potential of combining multi-task learning with safety-constrained RL to solve real-world robotics problems, an incremental but meaningful step toward autonomous robot capabilities.