Harmanoid: Dual-Humanoid Motion Imitation

Updated 18 December 2025
  • Harmanoid is a dual-humanoid motion imitation framework that jointly models inter-agent contacts and interaction dynamics to ensure coordinated whole-body interaction.
  • It leverages contact-aware motion retargeting and a curriculum-based reinforcement learning controller to optimize kinematic tracking and maintain physical plausibility.
  • Empirical evaluations show a 25% increase in retargeting success rates and superior performance over single-humanoid approaches in tasks like handshakes and dancing.

Harmanoid is a dual-humanoid motion imitation framework designed to resolve the isolation issue in prior humanoid imitation methods, enabling physically grounded, socially meaningful whole-body interaction between two humanoid robots. It achieves tight inter-agent coordination by jointly modeling contacts and interaction dynamics at every processing stage. Contact-aware motion retargeting is combined with a multi-term, curriculum-based reinforcement learning controller that optimizes both kinematic tracking and physically plausible inter-agent contact. The framework outperforms single-humanoid frameworks, which fail in closely coupled tasks such as handshakes, dancing, and object carrying (Liu et al., 11 Oct 2025).

1. Motivation and Overview

The isolation issue arises in single-humanoid imitation, where the retargeting and control processes synthesize each agent's actions independently and therefore lose inter-body contact fidelity. When such controllers operate in close proximity (as during collaboration), this deficiency shows up as misaligned or interpenetrating meshes and uncoordinated, unphysical behavior. Harmanoid addresses these shortcomings by explicitly detecting and preserving inter-agent contacts throughout the retargeting and control pipeline, and by introducing interaction-aware policy objectives. Its architecture consists of (i) Contact-Aware Motion Retargeting and (ii) an Interaction-Driven Motion Controller (Liu et al., 11 Oct 2025).

2. Contact-Aware Motion Retargeting

Harmanoid begins from a pair of temporally aligned human SMPL (Skinned Multi-Person Linear) model trajectories $M_{1:T}^{(i)} = (\beta^{(i)}, \theta_{1:T}^{(i)}, p_{1:T}^{(i)})$, $i \in \{1,2\}$, where $\beta$ encodes body shape, $\theta$ the joint rotations, and $p$ the global root translations. Key steps:

  1. Contact Mesh Detection: For each frame $t$, the two agents' SMPL meshes are scanned for face pairs $(f_i, f_j)$ whose mesh-face distance is within a fixed threshold $\epsilon$:

$$\mathcal{C}_t = \{(f_i, f_j) \mid f_i \in F^{(1)},\ f_j \in F^{(2)},\ \mathrm{dist}(f_i, f_j) \leq \epsilon \}$$

  2. Regularized Shape Optimization: Each robot's target shape $\beta'$ is optimized to align robot kinematics and maintain mesh-level spatial consistency by minimizing:

$$\mathcal{L}_{\mathrm{retarget}} = \mathcal{L}_{\mathrm{keypoint}} + \lambda\|\beta'\|_2^2$$

  3. Contact-Aware Root Pose Optimization: The root positions and orientations are offset to minimize the centroidal distance at detected contact face pairs:

$$(\Delta p_{\mathrm{root}}, \Delta \theta_{\mathrm{root}}) = \arg\min_{\Delta p, \Delta \theta} \sum_{t=1}^{T} \sum_{(f_i, f_j) \in \mathcal{C}_t} \|\hat{v}_i - \hat{v}_j\|_2^2$$

  4. Final Retargeting: Robot reference trajectories are produced by inverse kinematics subject to joint limits, and per-link contact masks $\widehat{c}_t^{(i)} \in \{0,1\}^L$ record the expected frame-wise contacts.

This procedure reduces interpenetration and yields reference trajectories that encode physically meaningful, interaction-consistent behavior.
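As a rough illustration of the contact detection and root offset steps, the sketch below finds contact pairs and solves a translation-only version of the root offset for a single frame. It rests on several assumptions that are not from the source: face centroids are already available as 3D points, centroid-to-centroid distance stands in for the mesh-face distance, and the root rotation is held fixed so that the least-squares optimum has a closed form (the mean of the paired centroid differences). The names `detect_contacts` and `translation_offset` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def detect_contacts(centroids_a: Sequence[Vec3],
                    centroids_b: Sequence[Vec3],
                    eps: float) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) whose face centroids lie within eps.

    Assumption: centroid distance approximates dist(f_i, f_j).
    """
    pairs = []
    for i, a in enumerate(centroids_a):
        for j, b in enumerate(centroids_b):
            if math.dist(a, b) <= eps:
                pairs.append((i, j))
    return pairs


def translation_offset(centroids_a: Sequence[Vec3],
                       centroids_b: Sequence[Vec3],
                       pairs: Sequence[Tuple[int, int]]) -> Vec3:
    """Translation for agent 1 minimizing the summed squared centroid
    distance over the contact pairs, with rotation held fixed.

    The least-squares optimum is the mean of (v_j - v_i) over all pairs.
    """
    if not pairs:
        return (0.0, 0.0, 0.0)
    sums = [0.0, 0.0, 0.0]
    for i, j in pairs:
        for k in range(3):
            sums[k] += centroids_b[j][k] - centroids_a[i][k]
    n = len(pairs)
    return (sums[0] / n, sums[1] / n, sums[2] / n)


# Toy data: only the first face of each agent is within the threshold.
faces_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
faces_b = [(0.0, 0.02, 0.0), (5.0, 0.0, 0.0)]
pairs = detect_contacts(faces_a, faces_b, eps=0.05)
delta = translation_offset(faces_a, faces_b, pairs)
```

A full treatment would also optimize the root orientation and pool contact pairs over all frames, which requires an iterative solver rather than this closed form.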

3. Interaction-Driven Motion Controller

The policy $\pi_\theta$ for both robots is trained jointly in a single MDP $\mathcal{M} = \langle \mathcal{S}, \mathcal{A}, \mathcal{T}, \mathcal{R}, \gamma \rangle$ using Proximal Policy Optimization (PPO). The observation space for each agent $i$ at time $t$ includes:

  • Agent proprioception $s_t^{\mathrm{prop}(i)}$
  • Time-aligned robot reference motions $\widehat{m}_t^{(i)}$
  • Partner summary $s_t^{\mathrm{other}(i)}$
  • Both agents' intended contact masks $(\widehat{c}_t^{(i)}, \widehat{c}_t^{(-i)})$
  • Measured contact states $c_t^{(i)}$

Core components of the reward function:

  • Interaction Reward ($r_t^{\mathrm{int}}$): Penalizes discrepancies in relative keypoint offsets between the simulated agents and the retargeted human references, weighted by task importance:

$$r_t^{\mathrm{int}} = \exp\left( -\sigma_{\mathrm{int}} \sum_{u,v} w_t(u,v)\, E_t(u,v) \right)$$

  • Contact Reward ($r_t^{\mathrm{con}}$) and Penalty ($p_t^{\mathrm{con}}$): Summed over the non-foot links $\mathcal{B}^{(i)}_{\mathrm{nf}}$ of each agent $i$, rewarding measured contact forces within reference bounds and penalizing spurious forces:

$$r_t^{\mathrm{con}} = \sum_{i} \sum_{b \in \mathcal{B}^{(i)}_{\mathrm{nf}}} r_{t,b,i}^{\mathrm{exp}}, \quad p_t^{\mathrm{con}} = \sum_{i} \sum_{b \in \mathcal{B}^{(i)}_{\mathrm{nf}}} p_{t,b,i}^{\mathrm{unexp}}$$

  • Curriculum Scheduling: The overall reward is a linear blend of the tracking, interaction, and contact terms, with proficiency-driven weights:

$$r_t = w_{\mathrm{trk}}(t)\, r_t^{\mathrm{goal}} + w_{\mathrm{int}}(t)\, r_t^{\mathrm{int}} + w_{\mathrm{con}}(t)\, (r_t^{\mathrm{con}} - p_t^{\mathrm{con}})$$

The weights evolve online using a proficiency score $s_t$ and update factor $\alpha_t$.
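The three reward components can be sketched in a few lines. The exponential form of the interaction reward and the linear blend follow the formulas above. The per-link contact terms are an assumption, since the source does not give their exact form: here each expected link with a force inside hypothetical bounds `[f_min, f_max]` earns a reward of 1, and each unexpected link with a force above `f_min` incurs a penalty of 1. All function names are illustrative.

```python
import math
from typing import Dict, Sequence, Tuple

Pair = Tuple[str, str]


def interaction_reward(errors: Dict[Pair, float],
                       weights: Dict[Pair, float],
                       sigma_int: float) -> float:
    """r_int = exp(-sigma_int * sum_{u,v} w(u,v) * E(u,v))."""
    total = sum(weights[pair] * err for pair, err in errors.items())
    return math.exp(-sigma_int * total)


def contact_terms(expected: Sequence[int],
                  forces: Sequence[float],
                  f_min: float,
                  f_max: float) -> Tuple[float, float]:
    """Return (reward, penalty) over one agent's non-foot links.

    Assumed per-link rule: unit reward for an expected contact whose force
    lies in [f_min, f_max]; unit penalty for an unexpected force > f_min.
    """
    reward, penalty = 0.0, 0.0
    for mask, force in zip(expected, forces):
        if mask and f_min <= force <= f_max:
            reward += 1.0
        elif not mask and force > f_min:
            penalty += 1.0
    return reward, penalty


def total_reward(r_goal: float, r_int: float, r_con: float, p_con: float,
                 w_trk: float, w_int: float, w_con: float) -> float:
    """r = w_trk * r_goal + w_int * r_int + w_con * (r_con - p_con)."""
    return w_trk * r_goal + w_int * r_int + w_con * (r_con - p_con)


# Toy values: one keypoint pair, three non-foot links.
r_int = interaction_reward({("l_hand", "r_hand"): 0.5},
                           {("l_hand", "r_hand"): 2.0}, sigma_int=1.0)
r_con, p_con = contact_terms([1, 0, 1], [5.0, 3.0, 0.0], f_min=1.0, f_max=10.0)
r_total = total_reward(0.8, r_int, r_con, p_con, 0.5, 0.3, 0.2)
```

Because the interaction reward is an exponential of a non-negative weighted error, it stays in $(0, 1]$ and equals 1 only when every relative keypoint offset matches the reference.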

PPO training uses IsaacSim, the Unitree H1 robot model (19 DOF), and 200M simulated environment steps.
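The source does not spell out how the curriculum weights are updated from the proficiency score $s_t$ and update factor $\alpha_t$. Purely as an illustration, the sketch below uses a hypothetical rule: once proficiency reaches a threshold, a fraction $\alpha$ of the tracking weight is moved in equal parts to the interaction and contact weights, so the weights keep the same total. The threshold value, the equal split, and the name `update_weights` are all assumptions.

```python
from typing import Dict


def update_weights(weights: Dict[str, float],
                   proficiency: float,
                   alpha: float,
                   target: float = 0.8) -> Dict[str, float]:
    """Hypothetical curriculum step over keys 'trk', 'int', 'con'.

    If proficiency >= target, shift alpha * w_trk from tracking to the
    interaction and contact terms (half each); otherwise leave unchanged.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if proficiency < target:
        return dict(weights)
    moved = alpha * weights["trk"]
    return {
        "trk": weights["trk"] - moved,
        "int": weights["int"] + moved / 2.0,
        "con": weights["con"] + moved / 2.0,
    }


start = {"trk": 0.8, "int": 0.1, "con": 0.1}
early = update_weights(start, proficiency=0.5, alpha=0.25)  # below threshold
later = update_weights(start, proficiency=0.9, alpha=0.25)  # above threshold
```

The intent of a schedule like this is that the policy first learns stable tracking and only later is pushed toward precise interaction and contact behavior.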

4. Empirical Evaluation and Benchmarks

Evaluations are conducted on ≈1M frames of interactive human motion (hugging, dancing, two-person handshakes, carrying) from Inter-X [Xu et al. 2024]. Baselines ExBody [Cheng et al. 2024] and HOVER [He et al. 2025] both fail to preserve stability and physical plausibility in dual settings. Harmanoid demonstrates superior performance across all reported metrics:

| Method | Succ ↑ | $E_{g\text{-}\mathrm{MPJPE}}$ (mm) ↓ | $E_{\mathrm{MPJPE}}$ (mm) ↓ | $E_{\mathrm{acc}}$ (mm/s²) ↓ | $E_{\mathrm{vel}}$ (mm/s) ↓ |
|---|---|---|---|---|---|
| ExBody | 0.00 | 193.6 | 91.7 | 5.41 | 9.43 |
| HOVER | 0.20 | 187.4 | 94.6 | 4.74 | 9.55 |
| Harmanoid | 0.92 | 156.8 | 88.7 | 3.08 | 7.55 |
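The source does not define the error metrics, so the sketch below follows the definitions commonly used in humanoid motion tracking work, stated here as assumptions: $E_{g\text{-}\mathrm{MPJPE}}$ is the mean per-joint position error in the global frame, $E_{\mathrm{MPJPE}}$ is the same error after subtracting each pose's root position, and $E_{\mathrm{vel}}$ and $E_{\mathrm{acc}}$ compare first and second finite differences of the joint positions. Joint 0 is taken to be the root, and all function names are illustrative.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Pose = Sequence[Vec3]  # one frame: a position per joint, joint 0 = root


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def mpjpe(pred: Sequence[Pose], ref: Sequence[Pose],
          root_relative: bool = False) -> float:
    """Mean Euclidean error over all frames and joints."""
    total, count = 0.0, 0
    for p_frame, r_frame in zip(pred, ref):
        for p, r in zip(p_frame, r_frame):
            if root_relative:
                p, r = _sub(p, p_frame[0]), _sub(r, r_frame[0])
            total += math.dist(p, r)
            count += 1
    return total / count


def finite_diff(frames: Sequence[Pose], dt: float) -> List[List[Vec3]]:
    """First-order finite difference between consecutive frames."""
    return [[tuple(c / dt for c in _sub(b, a)) for a, b in zip(f0, f1)]
            for f0, f1 in zip(frames, frames[1:])]


def velocity_error(pred: Sequence[Pose], ref: Sequence[Pose], dt: float) -> float:
    return mpjpe(finite_diff(pred, dt), finite_diff(ref, dt))


def acceleration_error(pred: Sequence[Pose], ref: Sequence[Pose], dt: float) -> float:
    vel_p, vel_r = finite_diff(pred, dt), finite_diff(ref, dt)
    return mpjpe(finite_diff(vel_p, dt), finite_diff(vel_r, dt))
```

Under these definitions, a prediction that is a constant global shift of the reference has a nonzero global error but zero root-relative, velocity, and acceleration error, which is why the two position metrics are reported separately.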

Harmanoid-optimized reference trajectories also show reduced interpenetration and an ≈25% increase in retargeting success rates relative to prior approaches.

5. Ablation, Failure Cases, and Limitations

Ablation studies demonstrate incremental contributions from the interaction reward, contact reward, and curriculum scheduling. Adding the interaction terms increases both the F1 contact-matching score and the success rate, and the full curriculum yields the best results (Succ = 0.92, F1 = 0.18). Baselines often fail to maintain shoulder-to-shoulder proximity or a stable handshaking pose.
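The source does not state exactly how the F1 contact-matching score is computed. A minimal sketch, assuming it is the standard F1 score between the expected per-link contact mask and the measured binary contact state, looks like this (`contact_f1` is a hypothetical name):

```python
from typing import Sequence


def contact_f1(expected: Sequence[int], measured: Sequence[int]) -> float:
    """Standard F1 between expected and measured binary contact flags."""
    tp = sum(1 for e, m in zip(expected, measured) if e and m)
    fp = sum(1 for e, m in zip(expected, measured) if not e and m)
    fn = sum(1 for e, m in zip(expected, measured) if e and not m)
    if tp == 0:
        # No correctly matched contact: define F1 as 0 to avoid dividing by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# One correct contact, one spurious contact, one missed contact.
score = contact_f1([1, 1, 0, 0], [1, 0, 1, 0])
```

Since contacts are sparse relative to the number of links and frames, F1 is a more informative measure than plain accuracy, which would be dominated by the many correctly predicted non-contacts.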

Identified challenges:

  • Curriculum synchronization can break under highly dynamic or asynchronous motions (e.g., rapid throws).
  • Joint carrying of heavy objects requires explicit dynamic load-sharing, not currently modeled.

6. Future Directions

Planned extensions include:

  • Force-based dynamic coordination models for collaborative object manipulation.
  • Deployment on physical robots with real-time perception, state broadcast, and sim-to-real domain adaptation mechanisms.
  • Extension to $N > 2$ agents, requiring generalized pairwise or groupwise interaction objectives.

Per the reported results, Harmanoid is the first approach that explicitly bridges kinematic fidelity and physical realism for dual-humanoid interactive motion imitation. Scaling and generalization to broader multi-agent systems remain open research questions (Liu et al., 11 Oct 2025).
