JoyLo Interface: Whole-Body Teleoperation

Updated 5 February 2026
  • JoyLo is a whole-body teleoperation interface that uses direct joint-to-joint mapping and virtual haptic feedback to enhance robot control in domestic environments.
  • It achieves high precision by delivering >95% replay success and reducing trajectory singularities by 78% compared to conventional VR-based interfaces.
  • Field evaluations show a 5× higher human success rate and 23% faster task completion, underpinning advances in whole-body visuomotor policy learning.

The JoyLo Interface is a cost-effective, high-fidelity whole-body teleoperation system developed as part of the BEHAVIOR Robot Suite (BRS) for real-world household mobile manipulation. JoyLo enables direct, operator-driven collection of high-quality behavioral data on a complex, bimanual, omnidirectional robotic platform, serving as a critical tool for human-in-the-loop demonstration and behavior cloning in whole-body robot policy learning (Jiang et al., 7 Mar 2025).

1. Hardware Architecture and Kinematic Mapping

JoyLo comprises two 3D-printed, kinematic-twin "leader" arms, actuated by low-cost Dynamixel motors (total cost under 500 USD), upon which standard Nintendo Joy-Con controllers are mounted. This design enables a direct joint-to-joint mapping, where each leader joint angle $q_\text{JoyLo}$ is mapped to its corresponding robot joint $q_\text{robot}$. The interface provides holistic control: the left Joy-Con thumbstick commands base velocity, the right thumbstick modulates torso yaw and pitch, and button/trigger inputs actuate bimanual parallel-jaw grippers. The hardware platform facilitates synchronized control over a 21-DoF robot (two 6-DoF arms, 4-DoF torso, 3-DoF base, and grippers), delivering an embodied, full-body teleoperation experience without reliance on high-cost commercial devices or electromagnetic trackers.

2. Haptic Feedback via Spring-Damper Coupling

In the absence of force sensors, JoyLo implements bilateral haptic feedback through a virtual spring-damper coupling. The control law is defined as:

$$\tau = K_p(q_\text{robot}-q_\text{JoyLo}) + K_d(\dot{q}_\text{robot}-\dot{q}_\text{JoyLo}) - K_{\text{damp}} \dot{q}_\text{JoyLo}$$

where $K_p$ and $K_d$ are stiffness and damping coefficients, and $K_{\text{damp}}$ weights the additional damping term on the leader joint velocity. This scheme discourages infeasible or unsafe motions, increases operator situational awareness, and provides implicit haptic signaling of contact interactions between the remote robot and its environment. Because the torque grows with the position and velocity mismatch between leader and robot and opposes rapid leader motion, the interface increases teleoperation safety and fidelity.
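As a minimal sketch of the control law above, the per-joint feedback torque can be computed directly from the leader and robot joint states. The names `HapticGains` and `feedback_torque` and the gain values in the usage note are illustrative assumptions, not part of the JoyLo codebase.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class HapticGains:
    """Hypothetical container for the three gains in the control law."""
    k_p: float     # stiffness on the leader/robot position mismatch
    k_d: float     # damping on the leader/robot velocity mismatch
    k_damp: float  # extra damping on the leader joint velocity


def feedback_torque(
    q_robot: Sequence[float],
    q_joylo: Sequence[float],
    dq_robot: Sequence[float],
    dq_joylo: Sequence[float],
    gains: HapticGains,
) -> List[float]:
    """Per-joint torque applied to the leader arm:

    tau = K_p (q_robot - q_joylo) + K_d (dq_robot - dq_joylo) - K_damp dq_joylo
    """
    n = len(q_robot)
    if not (len(q_joylo) == len(dq_robot) == len(dq_joylo) == n):
        raise ValueError("all joint vectors must have the same length")
    return [
        gains.k_p * (qr - ql) + gains.k_d * (dqr - dql) - gains.k_damp * dql
        for qr, ql, dqr, dql in zip(q_robot, q_joylo, dq_robot, dq_joylo)
    ]
```

For example, with assumed gains `HapticGains(k_p=2.0, k_d=0.1, k_damp=0.05)`, a single joint with `q_robot=0.5`, `q_joylo=0.4`, a stationary robot, and a leader velocity of `0.2` yields a torque of about `0.17`: the spring term pulls the leader toward the robot while both damping terms resist the leader's motion.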

3. Data Collection Performance and Quality Metrics

JoyLo operates at a 100 Hz control loop, with demonstrator actions and multimodal observations (RGB-D, fused ego-centric point clouds, odometry, joint states) recorded at 10 Hz. BRS data collection yielded 561 human demonstration trajectories across five distinct, long-horizon household tasks, with each trajectory lasting 60–210 seconds and covering diverse initial object configurations. The interface yielded a >95% replay success rate and reduced the trajectory singularity ratio by 78% versus inverse-kinematics-based teleoperation solutions (VR controllers, Apple Vision Pro), supporting high-quality, "verified" datasets for behavior cloning (Jiang et al., 7 Mar 2025). Trajectory-level success rates indicate substantial improvements over prior platforms in both task reproduction and avoidance of kinematic failures.
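The relationship between the 100 Hz control loop and the 10 Hz recording rate can be sketched as a simple decimation: every tenth control tick is logged. The function `run_episode` and its callback parameters are hypothetical placeholders, and the loop omits real-time pacing; this is not the BRS recording pipeline.

```python
from typing import Callable, List

CONTROL_HZ = 100                          # control loop rate stated in the text
RECORD_HZ = 10                            # recording rate stated in the text
RECORD_EVERY = CONTROL_HZ // RECORD_HZ    # log one sample every 10 control ticks


def run_episode(
    duration_s: float,
    read_leader: Callable[[], List[float]],      # hypothetical: read leader joint angles
    send_command: Callable[[List[float]], None], # hypothetical: forward them to the robot
    record: Callable[[int, List[float]], None],  # hypothetical: append a sample to the dataset
) -> int:
    """Run the control loop for duration_s seconds of ticks and return the number of recorded samples."""
    n_ticks = int(round(duration_s * CONTROL_HZ))
    recorded = 0
    for tick in range(n_ticks):
        action = read_leader()
        send_command(action)          # every tick drives the robot
        if tick % RECORD_EVERY == 0:  # only every tenth tick is stored
            record(tick, action)
            recorded += 1
    return recorded
```

Under these assumptions, a 60-second trajectory produces 600 recorded samples and a 210-second trajectory produces 2,100, which gives a sense of the per-trajectory dataset size implied by the stated durations.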

4. Comparative Evaluation with Alternative Teleoperation Interfaces

User studies conducted in simulation on the "clean house" task directly compared JoyLo with conventional VR controllers and the Apple Vision Pro interface. Across 10 users, JoyLo yielded a 5× higher human success rate, 23% faster completion, and a 78% lower singularity ratio, and was consistently rated the most user-friendly interface by both novice and expert teleoperators. These gains are attributed to the system's direct kinematic mapping and real-time bilateral feedback, in contrast to inverse-kinematics-based, end-effector-driven paradigms.

| Interface | Human Success Rate | Completion Time (rel.) | Singularity Ratio (rel.) | User Preference |
|---|---|---|---|---|
| JoyLo | 5× higher | 23% faster | 78% lower | Highest |
| VR Controllers | 1× (baseline) | Baseline | Baseline | Lower |
| Apple Vision Pro | Comparable to VR | Comparable | Comparable | Lower |
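The relative figures in the comparison are simple ratios against the baseline interface. The sketch below shows the arithmetic only; the helper names are hypothetical and the input numbers in the usage note are placeholders chosen to reproduce the reported ratios, not measurements from the user study.

```python
def success_ratio(rate: float, baseline_rate: float) -> float:
    """How many times higher a success rate is than the baseline (e.g. 5.0 means 5x)."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return rate / baseline_rate


def percent_reduction(value: float, baseline: float) -> float:
    """Percentage by which value is lower than baseline (e.g. completion time or singularity ratio)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - value) / baseline
```

With placeholder inputs, `success_ratio(0.5, 0.1)` corresponds to a 5× higher success rate, `percent_reduction(77.0, 100.0)` to 23% faster completion, and `percent_reduction(0.11, 0.50)` to a 78% lower singularity ratio, however the study defines that ratio.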

5. Role in Visuomotor Policy Learning

The principal utility of JoyLo lies in collecting the high-fidelity, temporally aligned human demonstrations needed for behavior cloning and diffusion-based learning. Data gathered via JoyLo underpins the training of WB-VIMA (Whole-Body VisuoMotor Attention), a policy architecture that learns sequential action denoising models over the full 21-DoF action space. The quality and robustness of JoyLo-collected demonstrations are instrumental in achieving high whole-body task success rates: WB-VIMA policies trained on this data produced 13×–21× higher end-to-end success rates compared to diffusion policy baselines that exploit lower-fidelity or alternative interfaces (Jiang et al., 7 Mar 2025).
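To make the 21-DoF action space concrete, the sketch below splits a flat action vector into body parts using the DoF counts given in the hardware description. The ordering of the parts, the part names, and the assumption that each gripper contributes one DoF (which makes the counts sum to 21) are illustrative guesses, not the layout used by WB-VIMA.

```python
from typing import Dict, List, Sequence

# Assumed layout: (part name, number of DoF). Order and gripper DoF are hypothetical.
ACTION_LAYOUT = (
    ("base", 3),
    ("torso", 4),
    ("left_arm", 6),
    ("right_arm", 6),
    ("left_gripper", 1),
    ("right_gripper", 1),
)
ACTION_DIM = sum(n for _, n in ACTION_LAYOUT)  # 3 + 4 + 6 + 6 + 1 + 1 = 21


def split_action(action: Sequence[float]) -> Dict[str, List[float]]:
    """Slice a flat whole-body action into named per-part segments."""
    if len(action) != ACTION_DIM:
        raise ValueError(f"expected {ACTION_DIM} values, got {len(action)}")
    parts: Dict[str, List[float]] = {}
    start = 0
    for name, n in ACTION_LAYOUT:
        parts[name] = list(action[start:start + n])
        start += n
    return parts
```

A policy that predicts the whole body jointly would output one such vector per timestep; splitting it by part is what lets each segment be routed to the base, torso, arm, and gripper controllers.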

6. Limitations and Directions for Extension

JoyLo's effectiveness is partially constrained by its embodiment coupling and the specific kinematic configuration of the target robot. Although the interface provides superior data quality for the BRS R1 robot, generalization to other robot morphologies may require hardware reconfiguration or sim-to-real transfer (e.g., multi-embodiment learning, policy pre-training). Current data collection, while high in quality, remains labor-intensive at scale; integrating demonstration synthesis or leveraging first-person human video may further improve efficiency. A plausible implication is that scalable, low-cost, "leader-follower" interfaces such as JoyLo will remain essential in expanding the repertoire and robustness of whole-body robotic policies, especially for manipulation in unstructured and diverse domestic environments (Jiang et al., 7 Mar 2025).
