DexPilot: Vision Based Teleoperation of Dexterous Robotic Hand-Arm System (1910.03135v2)

Published 7 Oct 2019 in cs.CV, cs.LG, and cs.RO

Abstract: Teleoperation offers the possibility of imparting robotic systems with sophisticated reasoning skills, intuition, and creativity to perform tasks. However, current teleoperation solutions for high degree-of-actuation (DoA), multi-fingered robots are generally cost-prohibitive, while low-cost offerings usually provide reduced degrees of control. Herein, a low-cost, vision based teleoperation system, DexPilot, was developed that allows for complete control over the full 23 DoA robotic system by merely observing the bare human hand. DexPilot enables operators to carry out a variety of complex manipulation tasks that go beyond simple pick-and-place operations. This allows for collection of high dimensional, multi-modality, state-action data that can be leveraged in the future to learn sensorimotor policies for challenging manipulation tasks. The system performance was measured through speed and reliability metrics across two human demonstrators on a variety of tasks. The videos of the experiments can be found at https://sites.google.com/view/dex-pilot.

Summary

The paper presents DexPilot, a glove-free teleoperation system that tracks the operator's bare hand using four Intel RealSense cameras and deep learning.
The system employs a novel kinematic retargeting method that maps human hand movements to a 23 DoA robotic platform by minimizing a cost function.
Experimental evaluations on fifteen tasks demonstrate high precision and reliability, highlighting its potential in remote operations such as surgical and rescue applications.

DexPilot: Vision-Based Teleoperation of Dexterous Robotic Hand-Arm System

The paper presents DexPilot, a vision-based teleoperation system for precise control of a high degree-of-actuation (DoA) robotic hand-arm system. Its key innovation is a low-cost setup that needs no expensive, sensor-laden gloves: it relies solely on visual observation of the bare human hand to drive a 23 DoA robotic system.

Key Contributions and System Architecture

DexPilot addresses the cost and complexity barriers traditionally associated with teleoperating dexterous robotic systems. The paper's primary contributions are:

Markerless, Glove-Free Tracking: The system employs a fully vision-based approach, utilizing four Intel RealSense depth cameras to observe the human hand. This eliminates the need for gloves or markers, thereby reducing setup complexity and interference with the natural hand motion.
Kinematic Retargeting: A novel aspect of DexPilot is its kinematic retargeting method, which maps human hand poses to the Allegro robotic hand. The mapping minimizes a carefully designed cost function that accounts for both distance and direction between human and robotic fingertips, with the aim of preserving precise manipulation capabilities.
Robust Real-Time Tracking: A combination of model-based and model-free tracking, supported by deep learning architectures, allows the system to track the human hand reliably over extended durations without failures. This approach leverages the strengths of both neural-network predictions and the optimization-based tracking of DART (Dense Articulated Real-Time Tracking).
Riemannian Motion Policies (RMPs): The system uses RMPs for motion generation. They govern the behavior of the robotic arm, providing collision avoidance and keeping the arm's motion consistent with the operator's hand motion.
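
To make the retargeting idea concrete, the following is a minimal sketch of a cost that penalizes mismatch in both length and direction between corresponding hand vectors (for example, fingertip-to-fingertip or palm-to-fingertip vectors). It is an illustration under stated assumptions, not the paper's exact formulation: the function name, the `scale`, `close_thresh`, and `close_dist` parameters, and the sample vectors are all hypothetical. Forward kinematics and the optimizer that would search over robot joint angles are left out.

```python
import numpy as np

def retarget_cost(robot_vecs, human_vecs, scale=1.0, weights=None,
                  close_thresh=0.0, close_dist=0.0, eps=1e-9):
    """Weighted squared error between robot vectors and scaled human vectors.

    Each human vector is split into a length and a unit direction. The target
    for the matching robot vector keeps the direction and uses a length of
    `scale * length`, unless the human length is below `close_thresh`, in
    which case the target length snaps to `close_dist` (hypothetical rule to
    encourage contact during pinches).
    """
    robot_vecs = np.asarray(robot_vecs, dtype=float)
    human_vecs = np.asarray(human_vecs, dtype=float)
    if weights is None:
        weights = np.ones(len(human_vecs))
    weights = np.asarray(weights, dtype=float)

    dist = np.linalg.norm(human_vecs, axis=1)                # distance term
    direction = human_vecs / np.maximum(dist, eps)[:, None]  # direction term
    target_len = np.where(dist < close_thresh, close_dist, scale * dist)
    target = target_len[:, None] * direction

    residual = robot_vecs - target
    return 0.5 * float(np.sum(weights * np.sum(residual ** 2, axis=1)))

# Made-up example vectors in metres; not measurements from the paper.
human = np.array([[0.00, 0.05, 0.00],
                  [0.03, 0.00, 0.04]])
perfect_match = retarget_cost(1.6 * human, human, scale=1.6)
mismatch = retarget_cost(human, human, scale=1.6)
```

In a full system, `robot_vecs` would come from the robot hand's forward kinematics, and a nonlinear optimizer would adjust the joint angles to drive this cost down at every tracking frame.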

Experimental Evaluation

The system was rigorously tested across a suite of fifteen tasks designed to challenge various aspects of dexterous manipulation, such as in-hand manipulation, precision grasps, and multi-stage operations. Tasks ranged from straightforward pick-and-place operations to complex manipulations like extracting paper currency from a wallet and conducting peg-in-hole insertions. Trials demonstrated the system's ability to successfully complete these tasks, highlighting its reliability and precision despite lacking direct tactile feedback.

Strong Numerical Results and System Performance

Quantitatively, the paper reports success rates and completion times for each task in the multi-task benchmark, measured across two human pilots. Completion time varied with task complexity, providing an empirical basis for assessing task difficulty and for comparing performance across tasks and pilots.
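
As a rough illustration of how per-task success rates and completion times like those discussed above could be tabulated from trial logs, here is a small sketch. The record layout (`task`, `pilot`, `success`, `seconds`) and every value in `trials` are placeholders invented for this example; they are not data from the paper.

```python
from collections import defaultdict
from statistics import mean

def summarize_trials(trials):
    """Group trials by task; report success rate and mean time of successes."""
    by_task = defaultdict(list)
    for trial in trials:
        by_task[trial["task"]].append(trial)

    summary = {}
    for task, rows in by_task.items():
        # Only successful trials contribute to the completion-time average.
        times = [r["seconds"] for r in rows if r["success"]]
        summary[task] = {
            "success_rate": sum(r["success"] for r in rows) / len(rows),
            "mean_time_s": mean(times) if times else None,
        }
    return summary

# Placeholder records for illustration only; NOT results from the paper.
trials = [
    {"task": "pick_place", "pilot": 1, "success": True, "seconds": 10.0},
    {"task": "pick_place", "pilot": 2, "success": True, "seconds": 14.0},
    {"task": "peg_in_hole", "pilot": 1, "success": True, "seconds": 30.0},
    {"task": "peg_in_hole", "pilot": 2, "success": False, "seconds": 45.0},
]
summary = summarize_trials(trials)
```

Averaging time over successful trials only is one possible design choice; failed attempts could instead be reported separately or capped at a time limit.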

Implications and Future Directions

The introduction of a low-cost, accurate, and robust teleoperation system like DexPilot has multiple implications:

Practical Applications: DexPilot's ability to conduct complex, precision-based teleoperation without tactile feedback opens up numerous possibilities in fields requiring remote manipulation, such as space exploration, search and rescue, and surgical teleoperation.
Theoretical Developments: The success of DexPilot underscores the potential of combining vision-based tracking with machine learning-driven optimizations in enhancing robotic dexterity. It invites further exploration into the application of deep learning in improving human-robot interaction.
Future Research: There is potential for integrating additional sensory feedback, such as haptics, to enhance the teleoperation experience. Scaling the system for larger workspaces and improving tracking accuracy with better imaging technologies are promising avenues for future research.

In conclusion, DexPilot exemplifies a practical yet sophisticated approach to robotic teleoperation, leveraging advances in vision and neural networks to enable complex robotic manipulation. The research offers a framework for developing future teleoperation systems that are both economically viable and technically advanced, and it suggests how human operators could interact with remote robotic systems more effectively.
