GELLO-Inspired Joint-Level Teacher Arm
- The paper demonstrates a low-cost, GELLO-inspired teacher arm that replicates robot kinematics with precise joint-level isomorphism for accurate imitation learning.
- It employs 3D-printed modules and back-drivable servomotors with integrated sensing and optional force feedback to capture high-fidelity human demonstration data.
- Extensive calibration, closed-loop control, and user studies validate its performance in contact-rich tasks and efficient teleoperated robot manipulation.
A GELLO-inspired joint-level teacher arm is a low-cost, kinematically isomorphic teleoperation device designed for collecting high-fidelity, large-scale human demonstration data for robot manipulation via imitation learning. The GELLO framework prescribes a 3D-printed, scaled model of the target robot manipulator, where each joint's kinematic configuration closely matches the corresponding degree of freedom on the target arm. These devices enable intuitive, direct manipulation by human operators and serve as the "leader" (teacher) in leader-follower teleoperation systems. Recent extensions further introduce force feedback, enhancing user interaction, supporting contact-rich tasks, and enriching the demonstration data for learning-based policy transfer (Wu et al., 2023, Sujit et al., 18 Jul 2025).
1. Kinematic Design and Isomorphism
The core principle of GELLO-inspired teacher arms is strict kinematic isomorphism to the target robot. Each teacher device replicates the number of joints ($n$), the joint axes, and the Denavit–Hartenberg (DH) parameters of the target, up to a uniform scaling factor $s$ applied to the link lengths. The forward kinematics of the teacher arm are

$$T_t(q) = \prod_{i=1}^{n} A_i\left(q_i;\ s\,a_i,\ s\,d_i,\ \alpha_i\right),$$

where $A_i$ is the homogeneous DH transform of joint $i$ and $(a_i, d_i, \alpha_i)$ are the target robot's link length, link offset, and twist. The joint angles map directly ($q_r = q_t$), and the kinematic chains differ only by spatial scale. All velocities and torques can consequently be mapped in joint space with identity or diagonal scaling factors. The end-effector velocities follow from the corresponding Jacobians:

$$\dot{x}_t = J_t(q)\,\dot{q}, \qquad \dot{x}_r = J_r(q)\,\dot{q},$$

allowing seamless leader-follower teleoperation (Wu et al., 2023, Sujit et al., 18 Jul 2025).
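The scale-only difference between the two chains can be checked numerically. The following is a minimal sketch, assuming the standard DH convention; the 3-joint DH table and the scale value are hypothetical and are not taken from the cited papers or from any real robot.

```python
import numpy as np

def dh_transform(theta, d, a, alpha):
    """Standard DH homogeneous transform for a single revolute joint."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

def forward_kinematics(q, dh_params, scale=1.0):
    """Chain per-joint transforms; `scale` multiplies the lengths d and a."""
    T = np.eye(4)
    for theta, (d, a, alpha) in zip(q, dh_params):
        T = T @ dh_transform(theta, scale * d, scale * a, alpha)
    return T

# Hypothetical DH table (d, a, alpha) for illustration only.
DH_ROBOT = [(0.30, 0.00, np.pi / 2), (0.00, 0.40, 0.0), (0.00, 0.35, 0.0)]
SCALE = 0.5  # assumed teacher-to-robot scale, not a value from the source

q = np.array([0.3, -0.7, 1.1])          # same joint angles on both arms
T_robot = forward_kinematics(q, DH_ROBOT)
T_teacher = forward_kinematics(q, DH_ROBOT, scale=SCALE)
```

With identical joint angles, the two end-effector orientations coincide and the teacher position is the robot position multiplied by the scale, which is why joint-space mirroring needs no inverse kinematics.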
2. Mechanical and Electrical Implementation
GELLO systems employ widely available, back-drivable servomotors mounted in 3D-printed PLA or PETG joint modules: the Dynamixel XL-330 for passive, position-control-only variants, or the XM540-W210-R for active force reflection. Gravity compensation is provided passively via springs or rubber bands at specific joints. The bill of materials for a typical 6- or 7-DoF device totals approximately \$260–300, broken down as follows:
| Component | Specification | Per-Arm Cost Estimate |
|---|---|---|
| Servo actuation | DYNAMIXEL servos, 6–7 × \$30 | \$180–210 |
| 3D-printed brackets | PLA/PETG, open-source geometry | \$20 |
| Electronics | Microcontroller, USB/RS-485 adapter | \$40 |
| Bearings/hardware | Steel rods, couplers, fasteners | \$10 |
| Power supply/wiring | 12–24V, 3–6A | \$10–20 |
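
As a quick arithmetic check, the per-component ranges in the table can be summed to confirm the quoted total. This sketch only restates the table's figures; the dictionary keys are illustrative labels.

```python
# (low, high) cost estimates in USD, copied from the table above.
BOM = {
    "servo_actuation": (180, 210),
    "printed_brackets": (20, 20),
    "electronics": (40, 40),
    "bearings_hardware": (10, 10),
    "power_supply_wiring": (10, 20),
}

total_low = sum(low for low, _ in BOM.values())
total_high = sum(high for _, high in BOM.values())
print(f"Estimated per-arm cost: ${total_low}-{total_high}")
```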
Each link is direct-coupled to its servo output (no external gearing), relying on the servomotors' back-drivability for smooth, fatigue-free user interaction. Actuators provide integral sensing via 12-bit encoders (≈0.088° per count) and, in force-reflection variants, enable direct current control for torque output. Assembly requires 3D printing of link/bracket designs, mechanical mounting, daisy-chain wiring, and base registration (Wu et al., 2023, Sujit et al., 18 Jul 2025).
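The encoder resolution quoted above follows directly from the 12-bit count range. The sketch below shows that conversion together with a sign and zero-offset normalization; the default `zero_ticks` value and the `sign` parameter are assumptions for illustration and are not settings taken from the source.

```python
import math

TICKS_PER_REV = 4096  # 2**12 counts per revolution for a 12-bit encoder

# Angular resolution in degrees per encoder count.
RESOLUTION_DEG = 360.0 / TICKS_PER_REV

def ticks_to_rad(ticks, zero_ticks=2048, sign=1):
    """Convert a raw encoder reading to a joint angle in radians.

    zero_ticks: raw reading at the joint's mechanical zero (assumed value).
    sign: +1 or -1, depending on how the servo is mounted (assumed parameter).
    """
    return sign * (ticks - zero_ticks) * 2.0 * math.pi / TICKS_PER_REV
```

For example, a reading 1024 counts past the zero position corresponds to a quarter turn.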
3. Sensing, Mapping, and Control Architecture
Each joint actuator provides continuous telemetry of position (and, optionally, velocity via finite differencing) at a fixed sampling rate. Joint states are streamed to the host computer via USB on the teacher side and via Ethernet/ROS on the robot side. For position-control variants, the teacher arm is back-driven by the human operator; no teacher-side actuation is necessary. The follower robot executes joint position commands:

$$q_r^{\text{cmd}}(t) = q_t(t).$$

In cases where torque control is available, a PD law enhances fidelity:

$$\tau_r = K_p\,(q_t - q_r) + K_d\,(\dot{q}_t - \dot{q}_r),$$

with the proportional gain $K_p$ expressed in Nm/rad and the derivative gain $K_d$ in Nm·s/rad.
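
The PD tracking law and the finite-difference velocity estimate can be sketched as follows. The function names and the gain values in the usage lines are hypothetical; they are placeholders for illustration, not tuning recommendations from the cited work.

```python
import numpy as np

def finite_difference_velocity(q_now, q_prev, dt):
    """Backward-difference joint velocity estimate from two position samples."""
    return (q_now - q_prev) / dt

def pd_torque(q_teacher, dq_teacher, q_robot, dq_robot, kp, kd):
    """Joint-space PD law: torque that drives the follower toward the teacher."""
    return kp * (q_teacher - q_robot) + kd * (dq_teacher - dq_robot)

# Usage with made-up numbers for a single joint.
dq_t = finite_difference_velocity(np.array([0.2]), np.array([0.1]), dt=0.01)
tau = pd_torque(np.array([1.0]), np.array([0.0]),
                np.array([0.5]), np.array([0.2]), kp=2.0, kd=0.5)
```

Because the gains multiply element-wise, passing arrays for `kp` and `kd` gives per-joint tuning without changing the function.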
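With force-reflection ("GELLO+F") configurations, external joint torques estimated on the follower (e.g., from the joint torque sensing of the Franka Panda, accessed via FCI) are scaled and reflected to the teacher via

$$\tau_t = \alpha\,\hat{\tau}_{\text{ext}},$$

where $\alpha$ is a scaling gain whose sign follows the chosen torque convention. The result is applied through Dynamixel's current control interface, which commands motor current in proportion to the desired torque (Sujit et al., 18 Jul 2025). The closed-loop system implements safety mechanisms (joint limit guards, current clamps, hardware E-stop, communication heartbeat monitoring).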
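
A minimal sketch of the force-reflection step, including the current clamp mentioned among the safety mechanisms, might look like the following. The linear torque-to-current model and all parameter names (`alpha`, `torque_constant`, `current_limit`) are assumptions for illustration; the sketch does not call any real servo SDK.

```python
import numpy as np

def reflect_torque_to_current(tau_ext, alpha, torque_constant, current_limit):
    """Map estimated follower torques to clamped teacher motor currents.

    tau_ext: estimated external joint torques on the follower (Nm).
    alpha: dimensionless reflection gain (assumed, sign set by convention).
    torque_constant: assumed linear motor constant (Nm per A).
    current_limit: symmetric safety clamp on the commanded current (A).
    """
    tau_teacher = alpha * np.asarray(tau_ext, dtype=float)
    current = tau_teacher / torque_constant
    return np.clip(current, -current_limit, current_limit)

# Usage with made-up values: the second joint saturates at the clamp.
currents = reflect_torque_to_current([1.0, -10.0], alpha=0.5,
                                     torque_constant=2.0, current_limit=1.0)
```

Clamping after the conversion keeps the limit expressed in the unit the actuator actually enforces.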
4. Calibration, Registration, and Data Flow
Initial calibration ensures rigorous spatial alignment:
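- Joint-zero alignment: Both teacher and follower bases are physically co-located, and the mechanical zeros of both arms are set at a common reference configuration.
- End-effector calibration: A set of $N$ joint configurations is collected; a Procrustes/SVD alignment minimizes

$$\min_{s,\,R,\,t}\ \sum_{k=1}^{N} \left\lVert s\,R\,p_t^{(k)} + t - p_r^{(k)} \right\rVert^2,$$

  aligning the pose spaces, where $p_t^{(k)}$ and $p_r^{(k)}$ are the teacher and follower end-effector positions in configuration $k$.
- Joint-space registration: Beyond sign and zero-offset normalization, the mapping is direct by design ($q_r = q_t$).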
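
The Procrustes/SVD alignment named in the calibration list can be sketched with the closed-form similarity solution (Umeyama's method). This is one common way to solve such a problem and is an assumption here; the source does not state which variant is used or whether a scale term is estimated.

```python
import numpy as np

def similarity_align(P, Q):
    """Least-squares scale s, rotation R, translation t with Q ~ s * R @ P + t.

    P, Q: (N, 3) arrays of corresponding points (e.g., teacher and follower
    end-effector positions at the same joint configurations).
    """
    mu_p, mu_q = P.mean(axis=0), Q.mean(axis=0)
    Pc, Qc = P - mu_p, Q - mu_q
    H = Qc.T @ Pc / len(P)                 # 3x3 cross-covariance
    U, S, Vt = np.linalg.svd(H)
    D = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        D[2, 2] = -1.0                     # avoid returning a reflection
    R = U @ D @ Vt
    var_p = (Pc ** 2).sum() / len(P)       # variance of the source points
    s = np.trace(np.diag(S) @ D) / var_p
    t = mu_q - s * R @ mu_p
    return s, R, t
```

Fixing `s = 1` and skipping the scale computation reduces this to the rigid (Kabsch) alignment, which would apply if the teacher positions are rescaled beforehand.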
During teleoperation, data—including RGB images (from two RealSense D435i cameras), joint positions, velocities, and, in GELLO+F, joint torques—are logged at 50–100 Hz. These data sets are then suitable for high-quality imitation learning (Wu et al., 2023, Sujit et al., 18 Jul 2025).
5. Performance, User Study Results, and Comparative Metrics
Empirical evaluation demonstrates low-latency, accurate tracking: round-trip latency is ≈50 ms, servo positional error is ±0.1°, and robot joint tracking error is ±0.2°. User studies (bimanual UR5, 12 novice subjects) report:
- Success rates (mean, 5 tasks): GELLO 0.92; SpaceMouse 0.63; VR 0.72.
- Completion times: GELLO is 20–40% faster than VR and 30–60% faster than SpaceMouse.
- Demonstration throughput: GELLO enables twice as many successful demonstrations per hour versus traditional input devices.
- Practice requirement: 5 minutes per user, independent of robotics background.
- Reliability: Fewer self-collisions and timeouts than alternatives.
In force-reflection studies, adding force inputs improved success rates in 3 out of 4 real and simulated manipulation tasks, most notably for drawer opening (0.62→0.93). The addition of force did not significantly increase perceived operator workload (NASA-TLX survey; differences were not significant after Bonferroni correction) (Sujit et al., 18 Jul 2025).
6. Imitation Learning Pipeline and Robotics Applications
GELLO-generated data support robot learning via action-chunking transformers (ACT) and other high-capacity policy architectures. The standard data pipeline records leader/follower joint positions, velocities, estimated torques (GELLO+F), and multi-view images at 50 Hz. The ACT model fuses proprioceptive and visual inputs and predicts a chunk of joint targets over a fixed multi-step horizon; a CVAE latent variable captures the variability across demonstrations. Force-augmented learning improves task generalization, particularly in contact-rich scenarios.
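
To illustrate what action chunking means at the data level, the sketch below slices a recorded joint-target trajectory into overlapping fixed-horizon chunks, one per starting timestep. The function name and the horizon value are hypothetical; this shows only the data layout, not the ACT model or its training procedure.

```python
import numpy as np

def make_action_chunks(joint_targets, horizon):
    """Return an array of shape (T - horizon + 1, horizon, dof).

    joint_targets: (T, dof) array of recorded leader joint positions.
    horizon: number of future timesteps per chunk (assumed hyperparameter).
    """
    joint_targets = np.asarray(joint_targets)
    T = len(joint_targets)
    if horizon > T:
        raise ValueError("horizon exceeds trajectory length")
    return np.stack([joint_targets[t:t + horizon]
                     for t in range(T - horizon + 1)])

# Usage: a 5-step, 2-DoF trajectory split into chunks of 3 steps.
trajectory = np.arange(10, dtype=float).reshape(5, 2)
chunks = make_action_chunks(trajectory, horizon=3)
```

Each chunk would be paired with the observation at its first timestep to form one training example.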
GELLO-style teacher arms have been successfully applied to:
- Complex bimanual tasks (dual-arm setups for UR5, Franka Panda, xArm7)
- Contact-rich tasks (pouring, cable insertion, towel folding, drawer and door opening, whiteboard erasing)
- Open-source, reconfigurable robotics research: mechanical CAD (STEP/STL), ROS node stacks, and electronics designs.
Deployment and reproduction require minimal infrastructure—assembly and calibration can be completed in less than a weekend (Wu et al., 2023, Sujit et al., 18 Jul 2025).
7. Safety, Extensibility, and Limitations
All systems enforce hardware and software safety protocols: emergency stop, joint limit/saturation, communication watchdogs, and maximum current/torque barriers. The GELLO pipeline readily generalizes to new robots by updating the kinematic model and reprinting the mechanical chain. No external force-torque sensor is required if the follower robot provides satisfactory estimates; otherwise, force sensors can be integrated at the end effector.
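
Two of the software safeguards listed above, joint-limit saturation and a communication watchdog, can be sketched in a few lines. The class and function names are hypothetical, and the timeout is an assumed parameter; a real deployment would additionally rely on the hardware emergency stop.

```python
import time
import numpy as np

def clamp_to_limits(q_cmd, q_min, q_max):
    """Saturate a joint command so it never leaves the allowed range."""
    return np.clip(q_cmd, q_min, q_max)

class HeartbeatWatchdog:
    """Flags a stale link when no heartbeat arrives within `timeout_s`."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock          # injectable clock, useful for testing
        self._last = clock()

    def beat(self):
        """Record that a message was just received."""
        self._last = self._clock()

    def expired(self):
        """True if the last heartbeat is older than the timeout."""
        return self._clock() - self._last > self.timeout_s
```

In a control loop, the follower would stop accepting new targets whenever `expired()` returns true, and every outgoing command would pass through `clamp_to_limits` first.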
This suggests that widespread, low-cost, high-fidelity collection of human demonstrations for imitation learning in robotics is feasible even in resource-constrained settings. A plausible implication is that advances in force and tactile feedback could further expand the operational envelope, especially for fine manipulation and dexterous control (Wu et al., 2023, Sujit et al., 18 Jul 2025).