GELLO Teleoperation Controller

Updated 27 December 2025

GELLO Controller is a low-cost, open-source teleoperation device that provides intuitive, kinematically isomorphic control for various robotic arms.
It integrates both joint-position control and optional haptic force-feedback, enabling precise demonstrations and improved imitation learning.
The device leverages 3D-printed components and off-the-shelf servos, achieving commercial-level performance at roughly $300 per unit.

The GELLO Controller is a low-cost, kinematically isomorphic teleoperation device enabling intuitive, high-fidelity human control and demonstration data collection for robotic manipulators. It achieves functional parity with advanced commercial master devices while leveraging 3D-printed structures and off-the-shelf actuators. The GELLO platform supports both traditional joint-position control and, in its extended form, haptic force-feedback for enhanced teleoperation and imitation learning tasks. Open-source by design, GELLO targets research settings that require scalable, reproducible, and affordable teleoperation solutions for single-arm and bimanual robot setups (Wu et al., 2023; Sujit et al., 18 Jul 2025).

1. Mechanical and Electronic Architecture

Each GELLO controller is constructed as a tabletop, kinematically equivalent replica of a specific target robotic arm (e.g., Franka Panda, UR5, xArm7). The mechanism closely reproduces the target’s joint sequence and frame definition under the Denavit–Hartenberg convention. For the UR5 template, the joint arrangement (with typical DH parameters) is preserved while the links are scaled down (0.2–0.3×), so that user motion maps 1:1 in joint space to the follower robot.

The core actuation employs six or seven Robotis DYNAMIXEL XL-330R servos (cost ≈ $35/unit), featuring 12-bit absolute encoders and TTL/RS-485 communication. Structural links, housings, and brackets are 3D-printed in PLA/PETG. Passive gravity compensation is achieved using springs or rubber bands on elbow joints to minimize user fatigue and maintain mechanical stability. Calibration involves encoder zeroing in a home configuration, gravity compensation adjustment to ±1°, and encoder count to joint-angle mapping:

$q_{d,i} = 2\pi \cdot (c_i/4096) + o_i$

where $c_i$ is the encoder count and $o_i$ is the per-joint offset (Wu et al., 2023).

Total hardware cost per arm is approximately $300. The master side supports both position and current control, enabling force-feedback extensions in the augmented system (Sujit et al., 18 Jul 2025).
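The count-to-angle mapping and the home-pose zeroing step can be sketched in a few lines of Python. This is a minimal illustration, not the project's calibration code: the function names, the three-joint example, and the home-pose values are all hypothetical, and only the 4096 counts per revolution (12-bit encoder) comes from the text.

```python
import math

COUNTS_PER_REV = 4096  # 12-bit absolute encoder, as stated in the text


def counts_to_angle(count, offset_rad=0.0):
    """Map a raw encoder count to a joint angle: 2*pi*(c/4096) + o (radians)."""
    return 2.0 * math.pi * (count / COUNTS_PER_REV) + offset_rad


def calibrate_offsets(home_counts, home_angles_rad):
    """Choose per-joint offsets so the home-pose counts map to known home angles."""
    return [angle - counts_to_angle(count)
            for count, angle in zip(home_counts, home_angles_rad)]


# Hypothetical home pose for a 3-joint example (values are illustrative only).
home_counts = [2048, 1024, 3072]
home_angles = [0.0, 0.0, math.pi / 2]

offsets = calibrate_offsets(home_counts, home_angles)
angles = [counts_to_angle(c, o) for c, o in zip(home_counts, offsets)]
```

After calibration, reading the home-pose counts back through `counts_to_angle` reproduces the chosen home angles, which is a convenient sanity check before teleoperating.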

2. System and Control Pipeline

The canonical GELLO configuration adopts a master–follower teleoperation scheme:

Master (leader): Tabletop GELLO arm (matching DOF and kinematics with the follower), operated by the user.
Follower: Robot arm (e.g., 7-DOF Franka Panda) running a real-time interface (e.g., Franka Control Interface).

The closed-loop pipeline (typically 50–200 Hz) comprises:

Master joint encoder readings $q_l$ and velocities $\dot{q}_l$.
Mapping to follower references $q_d = q_l$, $\dot{q}_d = \dot{q}_l$ (typically a direct mapping, with the follower reference $q_r = q_d$, after offset and ratio calibration).
Follower impedance control:

$\tau_f = K_p (q_d - q) + K_d (\dot{q}_d - \dot{q})$

with $K_p$, $K_d$ as diagonal stiffness/damping matrices.
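Because the stiffness and damping matrices are diagonal, the impedance law reduces to an independent PD term per joint. The sketch below shows this under stated assumptions: the function name, the two-joint example, and all gain values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def impedance_torque(q_d, dq_d, q, dq, kp, kd):
    """Per-joint tau_f = Kp (q_d - q) + Kd (dq_d - dq).

    kp and kd hold the diagonal entries of the stiffness and damping matrices.
    """
    return [
        kp_i * (qd_i - q_i) + kd_i * (dqd_i - dq_i)
        for kp_i, kd_i, qd_i, q_i, dqd_i, dq_i in zip(kp, kd, q_d, q, dq_d, dq)
    ]


# Hypothetical two-joint example: references come from the master, state from the follower.
tau_f = impedance_torque(
    q_d=[0.10, -0.20], dq_d=[0.0, 0.0],   # desired position / velocity
    q=[0.00, -0.10], dq=[0.5, 0.0],       # measured position / velocity
    kp=[100.0, 50.0], kd=[2.0, 1.0],      # illustrative gains, not tuned values
)
```

In a real loop this function would be evaluated once per control cycle, with the gains tuned for the specific follower arm.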

Optionally, the follower estimates environment interaction forces and maps the external wrench $F_{ext}$ to joint torques through the transpose of its Jacobian $J(q)$:

$\tau_{ext} = J^\top(q)\, F_{ext}$

In the force-feedback extension, a feedback torque proportional to $\tau_{ext}$ (scaled by a feedback gain) is commanded as current to the GELLO motors, providing haptic cueing to the user (Sujit et al., 18 Jul 2025).
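A minimal sketch of the wrench-to-torque step and the scaled feedback follows, using the standard Jacobian-transpose relation between an end-effector force and joint torques. Everything specific here is an assumption made for illustration: the planar two-link arm, the gain of 0.1, the clipping limit, and the function names. The sign convention and the torque-to-current conversion depend on the motors and are not given in the text.

```python
def joint_torques_from_wrench(jacobian, wrench):
    """Compute tau_ext = J^T F for an m x n Jacobian (list of rows) and an m-vector F."""
    n_joints = len(jacobian[0])
    return [
        sum(jacobian[row][j] * wrench[row] for row in range(len(wrench)))
        for j in range(n_joints)
    ]


def feedback_torque(tau_ext, gain, limit):
    """Scale the external joint torques by a gain and clip to a per-motor limit."""
    return [max(-limit, min(limit, gain * t)) for t in tau_ext]


# Planar two-link arm with unit link lengths at q = (0, pi/2):
# J = [[-s1 - s12, -s12], [c1 + c12, c12]] = [[-1, -1], [1, 0]].
J = [[-1.0, -1.0],
     [1.0, 0.0]]
F_ext = [0.0, 2.0]  # hypothetical 2 N force along +y at the end effector

tau_ext = joint_torques_from_wrench(J, F_ext)
tau_fb = feedback_torque(tau_ext, gain=0.1, limit=0.15)  # illustrative gain and limit
```

The clipping step reflects the general need to keep commands within what small servos can deliver; the specific limit used by the force-feedback extension is not stated here.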

Sensor modalities may include built-in torque estimation on the follower (no dedicated 6-axis F/T sensor required), and visual feedback via Intel RealSense RGB-D cameras (overhead and wrist-mounted).

The ROS-based software stack orchestrates device communication, with the gello_driver managing joint feedback and zero offsets, and teleop_mapper forwarding commands between master and follower. Data collection for imitation learning leverages the dataset_recorder node, capturing joint states, images, and optional force signals (Wu et al., 2023).

3. Imitation Learning Integration and Policy Architecture

GELLO is optimized for human-in-the-loop demonstration collection, enabling efficient generation of high-quality datasets for imitation learning (IL). The typical policy architecture for learning from GELLO demonstrations is the Action-Chunking Transformer (ACT) trained as a conditional variational autoencoder (CVAE):

State vector $s_t$: Visual embeddings (stereo RGB-D), master joint positions and velocities ($q_l$, $\dot{q}_l$), and follower external torque signals ($\tau_{ext}$).
Action vector $a_t$: Chunk of $k$ future joint-position commands for the master arm.
Loss function:

$\mathcal{L} = \mathcal{L}_{\text{recon}} + \beta\, D_{\mathrm{KL}}\big(q_\phi(z \mid a_{t:t+k}, s_t) \,\|\, \mathcal{N}(0, I)\big)$

with $\mathcal{L}_{\text{recon}}$ the action-reconstruction error and $\beta$ weighting the KL regularization term.
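The general shape of a CVAE training objective for action chunks (a reconstruction term plus a weighted KL term to a standard normal prior) can be written out directly. The sketch below assumes an L1 reconstruction error and a diagonal-Gaussian posterior; the weight `beta`, the chunk contents, and the function names are hypothetical and are not taken from the cited papers.

```python
import math


def l1_reconstruction(pred_chunk, target_chunk):
    """Mean absolute error over a chunk of joint-position commands."""
    errors = [
        abs(p - t)
        for pred, target in zip(pred_chunk, target_chunk)
        for p, t in zip(pred, target)
    ]
    return sum(errors) / len(errors)


def kl_to_standard_normal(mu, logvar):
    """Closed-form KL( N(mu, diag(exp(logvar))) || N(0, I) )."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, logvar))


def cvae_loss(pred_chunk, target_chunk, mu, logvar, beta):
    """Reconstruction error plus beta-weighted KL regularizer."""
    return (l1_reconstruction(pred_chunk, target_chunk)
            + beta * kl_to_standard_normal(mu, logvar))


# Hypothetical chunk of k = 2 commands for a 2-joint arm.
pred = [[0.0, 0.0], [1.0, 1.0]]
target = [[0.0, 1.0], [1.0, 1.0]]
loss = cvae_loss(pred, target, mu=[1.0, 0.0], logvar=[0.0, 0.0], beta=10.0)
```

In practice these terms would be computed on batched tensors inside a deep-learning framework; the plain-Python version only makes the arithmetic explicit.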

Inclusion of force signals (the external joint torques $\tau_{ext}$) in both data collection and IL policy inputs enhances robustness and success rates for force-centric manipulation tasks (Sujit et al., 18 Jul 2025). Demonstrations are recorded in ROS bag format and can be replayed on the physical robot for validation (Wu et al., 2023).

4. Experimental Performance and Comparative Analysis

Quantitative evaluation: Both simulated and real-world manipulation tasks have been benchmarked with GELLO and its force-feedback extension:

Tasks: Nut assembly (sim), door opening (sim), drawer opening (real), whiteboard erasing (real).
Metrics: Success rate over 45 rollouts/task per condition.
Results (mean ± std):

Policy inputs	Nut assembly (sim)	Door opening (sim)	Drawer opening (real)	Whiteboard erasing (real)
Position only	0.60 ± 0.13	0.96 ± 0.21	0.62 ± 0.48	0.24 ± 0.43
Position+Force	0.42 ± 0.10	1.00 ± 0.00	0.93 ± 0.25	0.36 ± 0.48

Force-augmented control improved success in three of the four tasks, most notably drawer opening (0.62 to 0.93), where missed grasps could be autonomously corrected. For nut assembly, where spatial precision dominates, force cues provided no advantage: the success rate was lower with force input (0.42) than with position only (0.60) (Sujit et al., 18 Jul 2025).
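One way to read the ± values in the results is as the standard deviation of per-rollout binary outcomes, which for a success rate $p$ equals $\sqrt{p(1-p)}$. The sketch below checks this reading against the 45 rollouts per condition. The success counts are inferred from the rounded rates rather than reported in the source, so they should be treated as assumptions.

```python
import math


def bernoulli_mean_std(successes, rollouts):
    """Mean and population standard deviation of binary success outcomes."""
    p = successes / rollouts
    return p, math.sqrt(p * (1.0 - p))


ROLLOUTS = 45  # rollouts per task and condition, as stated in the text

# Success counts inferred from the rounded rates (assumed, not reported).
for successes in (43, 42, 28, 16, 11):
    mean, std = bernoulli_mean_std(successes, ROLLOUTS)
    print(f"{successes}/{ROLLOUTS} -> {mean:.2f} ± {std:.2f}")
```

Under this reading the door, drawer, and whiteboard entries are reproduced (for example, 42/45 gives 0.93 ± 0.25), whereas the nut-assembly deviations (0.13 and 0.10) are much smaller than a Bernoulli spread and were presumably aggregated differently.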

In head-to-head device comparisons, GELLO outperformed VR controllers (e.g., Meta Quest 2) and 3D SpaceMouse devices across representative tasks, achieving an average success rate of 0.92 versus 0.72 (VR) and 0.63 (SpaceMouse), and delivering faster completion times—approaching the lower bound of direct human manipulation (Wu et al., 2023).

5. Human Factors and User Feedback

A user study was conducted with 20 participants (mixed experience; mean age 32.5 ± 7.7), comparing the original and force-feedback GELLO on the erasing task. The protocol used a within-subjects design, with 2 min of practice per controller followed by a NASA-TLX workload questionnaire.

TLX scores (physical demand): 38.2 ± 22.1 (force-feedback) vs. 40.2 ± 19.8 (baseline); no statistically significant differences across subscales after Bonferroni correction.
Qualitative results: Novices interpreted force-feedback as “resistance”; experienced users recognized and preferred it, citing more informative haptic cues and increased confidence in execution.
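Bonferroni correction divides the significance level by the number of comparisons, which for the six NASA-TLX subscales makes the per-test threshold considerably stricter. The sketch below illustrates the mechanics only: the overall level of 0.05 is an assumed convention, and the p-values are hypothetical placeholders, not results from the study.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return the corrected per-test threshold and a significance flag per test."""
    threshold = alpha / len(p_values)
    return threshold, [p < threshold for p in p_values]


# The six NASA-TLX subscales.
subscales = ["mental", "physical", "temporal", "performance", "effort", "frustration"]

# Hypothetical p-values for illustration only (NOT taken from the user study).
p_values = [0.21, 0.64, 0.03, 0.48, 0.09, 0.33]

threshold, flags = bonferroni_significant(p_values)
significant = [name for name, flag in zip(subscales, flags) if flag]
```

With these placeholder values, the 0.03 entry would pass an uncorrected 0.05 test but fails the corrected threshold of 0.05/6, which is how a difference can disappear after correction.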

A plausible implication is that the addition of force-feedback enhances usability and preference primarily for robotics-experienced users, while not impacting perceived workload substantially for the broader population (Sujit et al., 18 Jul 2025).

6. Limitations and Future Research Directions

Key limitations include:

Force-feedback performance is constrained by the low continuous torque and lack of dynamic parameter knowledge for the DYNAMIXEL motors, which restricts the feedback gain that can be used while remaining stable.
Visual occlusions, such as during close-up manipulations (e.g. erasing), can limit the utility of force cues and degrade IL performance.
Force augmentation offers no clear benefit for visually or kinematically dominated precision tasks.

Highlighted research directions encompass:

Four-channel (4C) bilateral control, contingent on full dynamic parameter identification of the GELLO arm, to realize symmetric action–reaction coupling.
Adaptive force-feedback tuning to match task-dependent contact stiffness.
Expanding IL architectures to leverage combined vision, force, and tactile streams for high-complexity dexterous manipulation (Sujit et al., 18 Jul 2025).

7. Open-Source Ecosystem and Accessibility

The GELLO platform is fully open-source, with comprehensive CAD, firmware, and ROS stacks available under the MIT license:

Project website: https://wuphilipp.github.io/gello/
GitHub: https://github.com/wuphilipp/gello/

Resources include mechanical design files, controller firmware, ROS nodes for teleoperation and data recording, and assembly/calibration guides. Variants are available for Franka, UR5, and xArm manipulators, supporting rapid adoption and extension of the framework in academic and research environments (Wu et al., 2023).


References

Wu et al. (2023). GELLO: A General, Low-Cost, and Intuitive Teleoperation Framework for Robot Manipulators. arXiv:2309.13037.

Sujit et al. (2025). Improving Low-Cost Teleoperation: Augmenting GELLO with Force.
