OmniRetarget: Interaction-Preserving Retargeting
- OmniRetarget is an interaction- and contact-preserving mesh-based framework that retargets human loco-manipulation to physically plausible robot motions.
- Its optimized Laplacian mesh deformation enforces kinematic and collision constraints, significantly reducing artifacts like foot-skating and penetration.
- Systematic data augmentation across robot morphologies, objects, and terrains enhances RL policy training for successful long-horizon tasks.
OmniRetarget is an interaction-preserving retargeting and data generation framework for transferring human whole-body loco-manipulation and scene-interaction motions to physically plausible, kinematically feasible trajectories for humanoid robots. Unlike prior approaches, OmniRetarget builds a spatially and contact-aware “interaction mesh” that captures agent, object, and terrain relationships; optimizes Laplacian mesh deformation subject to kinematic constraints; and systematically augments data over robot morphologies, objects, and terrain configurations. Its robustness, contact fidelity, and freedom from artifacts such as foot-skating and penetration have been validated quantitatively against baseline retargeting methods. Its output serves as rich reference data for training reinforcement learning (RL) policies, enabling efficient sim-to-real transfer of long-horizon, agile, and expressive whole-body skills.
1. Motivation and Conceptual Basis
OmniRetarget was developed to address two systemic challenges in humanoid robot skill learning: the embodiment gap (discrepancies between human and robot morphology that lead to kinematically infeasible retargeting artifacts) and the neglect of scene and object contact relationships in conventional motion retargeting. Existing pipelines often rely on unconstrained keypoint mapping from human MoCap data to robot joints, producing physically implausible motions that exhibit foot-skating (a foot sliding when it should remain in stationary contact) and penetration (violation of collision constraints). These limitations degrade RL policies trained on such data, yielding poor real-world transfer and diminished motion expressivity. OmniRetarget introduces a contact- and interaction-preserving mesh-based paradigm that explicitly models the spatial configuration of agent, objects, and environment, and enforces kinematic constraints during optimization.
2. Interaction Mesh Framework
The pivotal technical construct is the interaction mesh. Each mesh is a volumetric Delaunay tetrahedralization whose nodes include key robot or human joint positions as well as densely sampled vertices from manipulated objects and environmental surfaces. This representation encodes not only the pose of the agent but also the explicit contacts and relative spatial relationships between all interactive elements at each timestep.
By constructing both human and robot interaction meshes for each frame, OmniRetarget facilitates a direct correspondence that can be “warped” for retargeting: stretching, shrinking, or translating mesh regions so that the target robot embodiment, object geometry, and terrain configuration are all explicitly incorporated. This approach preserves local and global relationships critical for maintaining the fundamental characteristics of the original interaction, such as hand placement on an object or foot support on terrain.
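The mesh construction described above can be illustrated with a minimal sketch. It assumes SciPy's `Delaunay` as the tetrahedralization routine and uses random stand-in points; the function name `build_interaction_mesh` and the choice to derive neighbors from shared tetrahedra are illustrative assumptions, not the framework's actual implementation.

```python
import numpy as np
from scipy.spatial import Delaunay


def build_interaction_mesh(agent_points, scene_points):
    """Tetrahedralize agent keypoints together with sampled object/terrain points.

    Returns the stacked vertices, the tetrahedra (index quadruples), and a
    per-vertex neighbor list: two vertices are neighbors if they share a tetrahedron.
    """
    vertices = np.vstack([agent_points, scene_points])
    tets = Delaunay(vertices).simplices  # shape (n_tets, 4) for 3D input
    neighbors = [set() for _ in range(len(vertices))]
    for tet in tets:
        for a in tet:
            for b in tet:
                if a != b:
                    neighbors[a].add(int(b))
    return vertices, tets, [sorted(n) for n in neighbors]


# Hypothetical stand-in data: 6 "joint" positions and 20 object/terrain samples.
rng = np.random.default_rng(0)
agent = rng.uniform(-0.5, 0.5, size=(6, 3))
scene = rng.uniform(-1.0, 1.0, size=(20, 3))
vertices, tets, neighbors = build_interaction_mesh(agent, scene)
print(len(vertices), tets.shape)
```

Because agent and scene points are tetrahedralized together, a hand keypoint near an object ends up with object vertices among its neighbors, which is how the mesh can encode relative spatial relationships rather than pose alone.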
3. Laplacian Mesh Deformation and Optimization
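OmniRetarget casts retargeting as the minimization of a Laplacian mesh deformation energy under strict physical constraints. The Laplacian coordinate of mesh node $i$ at time $t$ is defined as

$$L(\mathbf{p}_{i,t}) = \mathbf{p}_{i,t} - \sum_{j \in \mathcal{N}(i)} w_{ij}\, \mathbf{p}_{j,t},$$

where neighbors are indexed by $j \in \mathcal{N}(i)$ and $w_{ij} = 1/|\mathcal{N}(i)|$ (uniform weights over the neighboring nodes $\mathcal{N}(i)$). The total deformation energy is the sum over all mesh vertices:

$$E_L = \sum_{i} \left\| L\!\left(\mathbf{p}^{\text{source}}_{i,t}\right) - L\!\left(\mathbf{p}^{\text{robot}}_{i,t}\right) \right\|^2 .$$

The constrained optimization for each target frame seeks the robot configuration $\mathbf{q}_t$ minimizing $E_L$ subject to:
- Collision avoidance: $\phi_k(\mathbf{q}_t) \ge 0$ for all robot-object/environment pairs $k$, via signed distance functions.
- Joint position and velocity bounds: $\mathbf{q}_{\min} \le \mathbf{q}_t \le \mathbf{q}_{\max}$; $\dot{\mathbf{q}}_{\min} \le \dot{\mathbf{q}}_t \le \dot{\mathbf{q}}_{\max}$.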
- “Foot-sticking” constraints: stance foot positions remain constant across support phases.
- Additional feasible set constraints for complex multi-contact interactions.
This optimization ensures that the robot’s retargeted motion is both a minimal deformation from the original human demonstration and strictly feasible under its morphology, velocity, and collision limits.
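A toy version of this constrained minimization can be sketched as follows. It is a simplification under stated assumptions: the optimizer moves mesh points directly instead of solving for robot joint angles through forward kinematics, the mesh is a single tetrahedron, and the only constraints are a pinned stance-foot vertex and a hypothetical wall at x = 0.8 standing in for a signed-distance collision constraint. SciPy's SLSQP solver is a stand-in, not necessarily the solver used by the authors.

```python
import numpy as np
from scipy.optimize import minimize


def laplacian_coordinates(points, neighbors):
    """Uniform-weight Laplacian: each vertex minus the mean of its neighbors."""
    return np.stack([points[i] - points[nbrs].mean(axis=0)
                     for i, nbrs in enumerate(neighbors)])


def deformation_energy(source_points, target_points, neighbors):
    """Sum of squared differences between source and target Laplacian coordinates."""
    diff = (laplacian_coordinates(source_points, neighbors)
            - laplacian_coordinates(target_points, neighbors))
    return float(np.sum(diff ** 2))


# Toy mesh: one tetrahedron, so every vertex neighbors the other three.
source = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
neighbors = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]

# Translating the whole mesh leaves the energy at (numerically) zero.
translated_energy = deformation_energy(source, source + np.array([0.3, -0.2, 0.5]), neighbors)

# Hypothetical constraints for the toy problem:
#   - vertex 0 is a stance foot pinned at its source position (foot-sticking);
#   - a wall at x = 0.8 with signed distance phi(p) = 0.8 - p_x, required >= 0.
WALL_X = 0.8
constraints = [
    {"type": "eq", "fun": lambda x: x.reshape(-1, 3)[0] - source[0]},
    {"type": "ineq", "fun": lambda x: WALL_X - x.reshape(-1, 3)[:, 0]},
]
result = minimize(
    lambda x: deformation_energy(source, x.reshape(-1, 3), neighbors),
    source.ravel(),  # start from the source pose, which violates the wall constraint
    method="SLSQP",
    constraints=constraints,
)
retargeted = result.x.reshape(-1, 3)
retargeted_energy = deformation_energy(source, retargeted, neighbors)
print(result.success, translated_energy, retargeted_energy)
```

The pinned foot removes the free translation that would otherwise satisfy the wall constraint at zero cost, so the solver has to deform the mesh; the remaining vertices shift slightly to spread that deformation across the Laplacian coordinates.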
4. Data Augmentation over Embodiment, Object, and Terrain
A principal benefit is efficient data augmentation: from a single human demonstration, OmniRetarget generates a wide variety of robot-object-terrain interactions. By systematically modifying the initial mesh configuration (e.g., repositioning objects, varying terrain heights/depths, or applying geometric transformations) and rerunning the optimization, the system outputs new trajectories that retain contact and spatial relationships yet adapt to the altered scenario.
This mesh-based augmentation is fundamentally different from prior approaches limited to target joint coordinate extrapolation or simple domain randomization; it ensures that critical interactions (e.g., maintaining a grasp on a moved object, stepping accurately onto shifted terrain) are preserved throughout.
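The scene-modification step can be sketched as a simple geometric transform applied to object vertices before the mesh is rebuilt and the optimization rerun. The function `augment_object`, its parameters, and the box dimensions below are hypothetical choices for illustration; the source does not specify which transformations or ranges are used.

```python
import numpy as np


def augment_object(object_points, translation=(0.0, 0.0, 0.0), scale=1.0, yaw=0.0):
    """Scale and rotate (about the vertical z axis) around the centroid, then translate."""
    c, s = np.cos(yaw), np.sin(yaw)
    rot_z = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    centroid = object_points.mean(axis=0)
    local = (object_points - centroid) * scale
    return local @ rot_z.T + centroid + np.asarray(translation)


# Hypothetical object: the 8 corners of a 0.4 x 0.3 x 0.2 m box.
box = np.array([[x, y, z] for x in (0.0, 0.4) for y in (0.0, 0.3) for z in (0.0, 0.2)])

# Sweep positions, heights, and sizes; each variant would define a new target scene.
variants = [
    augment_object(box, translation=(dx, 0.0, dz), scale=s)
    for dx in (-0.1, 0.0, 0.1)
    for dz in (0.0, 0.05)
    for s in (0.8, 1.0, 1.2)
]
print(len(variants))
```

Each variant only changes the scene geometry. Preserving the interaction (for example, keeping the hands on the moved box) is left to the mesh-deformation optimization that runs on the modified scene.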
5. Evaluation and Quantitative Performance
OmniRetarget has been extensively evaluated against standard motion retargeting baselines (PHC, GMR, VideoMimic) using OMOMO, LAFAN1, and an in-house MoCap dataset. More than 8 hours of trajectories were generated and assessed for kinematic constraint satisfaction and contact preservation. Key results include:
- Near-zero penetration rates.
- Minimal foot-skating across diverse robot models and motions.
- Contact preservation exceeding that of the comparison baselines for both robot-object and robot-terrain interactions.
- RL training tasks (robot-object interaction): 82.2% success, substantially outperforming other pipelines.
The quality of these retargeted motions enabled proprioceptive RL policies to execute long-horizon (up to 30 s) agile behaviors (e.g., parkour with dynamic object manipulation, terrain traversal) on the Unitree G1 humanoid, using only five shared reward terms and simple domain randomization, without the need for task-specific learning curricula.
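To make the foot-skating and penetration measures concrete, the sketch below shows one plausible way to compute them from a retargeted trajectory. These definitions (mean horizontal foot speed during contact, and deepest negative signed distance) are assumptions for illustration; the exact metric definitions used in the evaluation are not given here, and the numbers are made up.

```python
import numpy as np


def foot_skating(foot_positions, in_contact, dt):
    """Mean horizontal foot speed (m/s) over steps where the foot stays in contact."""
    horizontal_step = np.linalg.norm(np.diff(foot_positions[:, :2], axis=0), axis=1)
    stays_in_contact = in_contact[1:] & in_contact[:-1]
    if not stays_in_contact.any():
        return 0.0
    return float(horizontal_step[stays_in_contact].mean() / dt)


def max_penetration(signed_distances):
    """Deepest penetration (m): the most negative signed distance, as a positive depth."""
    return float(max(0.0, -np.min(signed_distances)))


# Hypothetical 4-frame foot trajectory sampled at 50 Hz; the foot lifts off in the last frame.
foot = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.01, 0.0, 0.0], [0.05, 0.0, 0.1]])
contact = np.array([True, True, True, False])
skate = foot_skating(foot, contact, dt=0.02)

# Hypothetical signed distances between robot links and scene geometry (negative = overlap).
depth = max_penetration(np.array([0.02, -0.003, 0.1]))
print(skate, depth)
```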
6. Representative Applications
OmniRetarget supports a spectrum of challenging whole-body loco-manipulation and scene interaction tasks:
- Parkour agility: carrying, climbing, using objects as supports, executing rolls to absorb landings.
- Complex object transportation: manipulating heavy or large items while maintaining continuous contact.
- Terrain adaptation: crawling or navigating uneven surfaces, handling elevation changes and multi-support transitions.
Its generalizable approach to data generation and explicit contact modeling allows both zero-shot sim-to-real policy transfer and effective augmentation of existing datasets for RL training or motion planning.
7. Future Directions and Open Problems
The framework’s sequential, frame-by-frame optimization could be extended to joint trajectory optimization for greater robustness to noisy demonstrations (e.g., video-tracked human motions) and integrated into visuomotor policies. Further research may incorporate curriculum learning for highly challenging motions and augment the interaction mesh paradigm with advanced perception frameworks. The open-sourcing of code, datasets, and policies provides an infrastructure for reproducibility and future advances in humanoid control and retargeting.
Taken together, these results suggest that OmniRetarget is a foundational scheme for interaction- and contact-preserving motion retargeting, contributing essential capabilities for data-driven humanoid robot skill synthesis, RL training, and transfer across varied embodiments, objects, and environments (Yang et al., 30 Sep 2025).