DexMachina: Functional Retargeting for Bimanual Dexterous Manipulation (2505.24853v1)

Published 30 May 2025 in cs.RO, cs.AI, and cs.LG

Abstract: We study the problem of functional retargeting: learning dexterous manipulation policies to track object states from human hand-object demonstrations. We focus on long-horizon, bimanual tasks with articulated objects, which is challenging due to large action space, spatiotemporal discontinuities, and embodiment gap between human and robot hands. We propose DexMachina, a novel curriculum-based algorithm: the key idea is to use virtual object controllers with decaying strength: an object is first driven automatically towards its target states, such that the policy can gradually learn to take over under motion and contact guidance. We release a simulation benchmark with a diverse set of tasks and dexterous hands, and show that DexMachina significantly outperforms baseline methods. Our algorithm and benchmark enable a functional comparison for hardware designs, and we present key findings informed by quantitative and qualitative results. With the recent surge in dexterous hand development, we hope this work will provide a useful platform for identifying desirable hardware capabilities and lower the barrier for contributing to future research. Videos and more at https://project-dexmachina.github.io/

Summary

The paper introduces DexMachina, a curriculum-based reinforcement learning method that functionally retargets human demonstrations to enable dexterous robotic hands to perform complex bimanual manipulation.
DexMachina uses decaying virtual object controllers and imitation rewards to guide learning, and it succeeds on complex bimanual tasks, such as manipulating articulated objects mid-air, across different robot hands.
This work provides a platform for comparing and benchmarking dexterous robotic hands, aiding in future hardware design and advancing automated manipulation capabilities in industry.

DexMachina: Functional Retargeting for Bimanual Dexterous Manipulation

Achieving human-like dexterity with robotic hands is difficult for two main reasons: the significant embodiment gap between human and robotic hands, and the complex spatiotemporal dynamics of bimanual tasks. This paper introduces DexMachina, a curriculum-based reinforcement learning (RL) algorithm designed to address these challenges. DexMachina focuses on functional retargeting: given human hand-object demonstration sequences, dexterous robotic hands learn manipulation policies that track the demonstrated object states, with an emphasis on long-horizon tasks involving articulated objects.

Core Contributions

DexMachina distinguishes itself by employing a novel approach: virtual object controllers with decaying strength. Initially, these controllers autonomously guide objects toward their target states following human demonstration trajectories. As the training progresses, the policy gradually learns to assume full control, aided by motion and contact guidance extracted from the demonstration data. This process significantly facilitates exploration within the high-dimensional action space and mitigates early-stage learning pitfalls often experienced with traditional RL methods.
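The idea of a virtual object controller with decaying strength can be illustrated with a minimal sketch. It assumes a PD-style controller acting directly on the object state; the class name, the gains `kp` and `kv`, and the `decay` method are hypothetical and are not taken from the paper's implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VirtualObjectController:
    """Hypothetical PD controller that pulls an object toward a demo target."""

    kp: float  # proportional gain (assumed form of "controller strength")
    kv: float  # damping gain (assumed)

    def force(self, pos: List[float], vel: List[float],
              target_pos: List[float]) -> List[float]:
        """Return the virtual force applied to the object, one value per axis."""
        return [self.kp * (t - p) - self.kv * v
                for p, v, t in zip(pos, vel, target_pos)]

    def decay(self, factor: float) -> None:
        """Scale both gains by `factor`, weakening the virtual guidance."""
        self.kp *= factor
        self.kv *= factor


# Early in training the controller does most of the work.
ctrl = VirtualObjectController(kp=100.0, kv=10.0)
strong = ctrl.force(pos=[0.0], vel=[0.0], target_pos=[1.0])

# Later the gains are reduced, so the policy must supply the missing force.
ctrl.decay(0.5)
weak = ctrl.force(pos=[0.0], vel=[0.0], target_pos=[1.0])
```

With large gains the object follows the demonstrated trajectory almost regardless of what the hands do; as the gains shrink toward zero, only contact forces from the hands can keep the object on target.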

Results show that DexMachina clearly outperforms baseline methods, especially on long-horizon manipulation involving complex interactions. For example, it enables several robotic hand embodiments, including Inspire, Allegro, Xhand, and Schunk, to execute intricate tasks such as manipulating a waffle iron mid-air. This task illustrates both the difficulty of bimanual coordination and the method's adaptability across different hand designs.

Methodological Overview

DexMachina is trained in an RL environment with state-based observations. These observations consist of object states, joint targets, and finger-to-object distances, all of which shape the robot's interaction strategies. The method uses a hybrid action formulation: wrist motion is restricted to stay close to the human demonstration trajectory, while finger joints are commanded with absolute actions. In addition, auxiliary rewards for imitation and contact matching guide the development of task strategies and act as soft constraints within the curriculum framework.
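One way to picture the hybrid action formulation is sketched below. It assumes the wrist command is a bounded offset around the demonstrated wrist pose and that finger actions in [-1, 1] are mapped onto joint limits; the function name, the `wrist_scale` bound, and the flat action layout are illustrative assumptions rather than details confirmed by the paper.

```python
from typing import List, Sequence, Tuple


def clip(x: float, lo: float, hi: float) -> float:
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def hybrid_action_to_targets(
    action: Sequence[float],
    demo_wrist: Sequence[float],
    finger_limits: Sequence[Tuple[float, float]],
    wrist_scale: float = 0.05,
) -> Tuple[List[float], List[float]]:
    """Split a normalized policy action into wrist and finger targets.

    Wrist: demo pose plus a small bounded offset (assumed residual form).
    Fingers: absolute joint targets, mapped from [-1, 1] to joint limits.
    """
    n_wrist = len(demo_wrist)
    wrist_raw, finger_raw = action[:n_wrist], action[n_wrist:]
    wrist = [ref + wrist_scale * clip(a, -1.0, 1.0)
             for ref, a in zip(demo_wrist, wrist_raw)]
    fingers = [lo + 0.5 * (clip(a, -1.0, 1.0) + 1.0) * (hi - lo)
               for a, (lo, hi) in zip(finger_raw, finger_limits)]
    return wrist, fingers


# Two wrist dimensions and two finger joints, purely for illustration.
wrist, fingers = hybrid_action_to_targets(
    action=[1.0, -1.0, 0.0, 1.0],
    demo_wrist=[0.5, 0.2],
    finger_limits=[(0.0, 1.0), (-1.0, 1.0)],
    wrist_scale=0.1,
)
```

Bounding the wrist offset keeps the arms near the demonstrated motion, which shrinks the search space, while absolute finger targets leave the policy free to find grasps suited to each robot hand.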

The curriculum is automatically scheduled based on reward progress. Virtual object controllers initially apply strong guidance and gradually weaken as the policy demonstrates the capability to maintain high task rewards independently. This structured transition allows the robot to retain motion strategies while adapting them to optimize task execution.
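A reward-gated schedule of this kind could look like the following sketch. It assumes a moving average of task reward is compared against a fixed threshold and that the controller gain is multiplied by a constant factor whenever the threshold is met; the class name and all parameter values are hypothetical, and the paper's actual scheduling rule may differ.

```python
from collections import deque


class RewardGatedCurriculum:
    """Hypothetical schedule: weaken guidance only while task reward stays high."""

    def __init__(self, gain: float = 1.0, decay: float = 0.5,
                 reward_threshold: float = 0.8, window: int = 3,
                 min_gain: float = 1e-3) -> None:
        self.gain = gain
        self.decay = decay
        self.reward_threshold = reward_threshold
        self.min_gain = min_gain
        self.history = deque(maxlen=window)  # most recent task rewards

    def update(self, task_reward: float) -> float:
        """Record one evaluation reward and return the (possibly reduced) gain."""
        self.history.append(task_reward)
        window_full = len(self.history) == self.history.maxlen
        if window_full:
            mean_reward = sum(self.history) / len(self.history)
            if mean_reward >= self.reward_threshold:
                self.gain *= self.decay
                if self.gain < self.min_gain:
                    self.gain = 0.0  # guidance fully removed
                self.history.clear()  # require fresh evidence at the new gain
        return self.gain


curriculum = RewardGatedCurriculum(window=2)
gains = [curriculum.update(r) for r in [0.9, 0.9, 0.1, 0.9, 0.9]]
```

Clearing the reward history after each decay step means the policy has to re-establish high reward under the weaker controller before the guidance is reduced again.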

Implications and Future Directions

DexMachina sets a precedent for functional comparisons and benchmarking of dexterous robotic hands. By systematically evaluating various hand designs through its algorithm and simulation environments, it paves the way for informed decisions in acquiring and designing new dexterous hands. The implications of this work are far-reaching, extending to practical applications in automated industries where sophisticated manipulation is required.

Looking towards future developments, potential advancements include enhancing the sim-to-real transferability of these learned policies. While current implementations rely on state-based inputs, future research could explore vision-based RL policies or the integration of advanced sensor data. Additionally, with the rapid progress in 3D modeling and reconstruction methods, more efficient and scalable data collection techniques could emerge, minimizing current limitations related to demonstration data acquisition.

By providing an effective platform for identifying desirable hardware capabilities, DexMachina contributes not only to the immediate advancement of dexterous manipulation technologies but also lays foundational work for the evolution of future AI systems capable of performing complex, human-like tasks autonomously.
