FSGlove: Advanced Inertial Hand MoCap
- FSGlove is an open-source system using dense inertial sensors and a differentiable calibration pipeline to capture 48 degrees of freedom in hand motion.
- It integrates IMU data with the MANO hand model to accurately estimate joint angles and personalize hand geometry via gradient-based optimization.
- The system demonstrates high precision in joint estimation and hand-object interaction, making it ideal for VR, robotics, and biomechanics research.
FSGlove is an open-source, inertial-based hand motion capture (MoCap) system designed for high-resolution tracking of human hand kinematics and personalized shape reconstruction. It combines dense inertial measurement unit (IMU) placement with a differentiable, shape-aware calibration pipeline (DiffHCal) that integrates with the parametric MANO hand model. The system offers 48 degrees of freedom (DoF) tracking—substantially exceeding the DoF provided by commercial gloves—and achieves state-of-the-art accuracy in both joint estimation and personalized hand geometry. FSGlove’s open hardware and software framework ensures modularity and interoperability with virtual reality (VR), robotics, and biomechanics ecosystems (Li et al., 25 Sep 2025).
1. Hardware Architecture and Signal Pipeline
FSGlove employs a dense sensing topology with 16 HI229 (BNO055) IMUs, providing comprehensive 3D rotational coverage: three sensors per finger, dorsally mounted on each proximal, middle, and distal phalanx, plus one on the hand dorsum for global orientation reference, giving 16 × 3 = 48 rotational DoF. The IMUs provide fused orientation estimates (static error ≈ 0.8°, dynamic error ≈ 2.5°) at 100 Hz, with options up to 400 Hz, and each outputs 16-bit orientation, accelerometer, and gyroscope data.
System integration is managed by a Raspberry Pi Zero 2 W single-board computer (quad-core 1.0 GHz, integrated Wi-Fi) interfaced via a custom USB–UART daughter board (dual CH9344, 16 UART channels over USB at 480 Mbps). Data streams over gRPC and Protocol Buffers to the host PC, with clock synchronization via NTP. Optional dorsal tracking with optical (Nokov) or VR (HTC Vive) systems is realized via inter-process communication (IPC). The glove–PC communication delay is ≈24 ms; the total display latency, including GPU rendering at 25 Hz, is ≈40 ms (Li et al., 25 Sep 2025).
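The NTP-synced timestamping and the ≈24 ms transport delay can be illustrated with a minimal sketch; the `ImuSample` fields and `host_time_of` helper below are hypothetical and do not reflect the actual FSGlove Protobuf schema:

```python
from dataclasses import dataclass

@dataclass
class ImuSample:
    """One fused orientation sample streamed from the glove.

    Hypothetical fields for illustration, not the real FSGlove schema.
    """
    imu_id: int          # 0-14: phalanx sensors, 15: hand dorsum
    t_glove: float       # capture time on the Pi, NTP-synced (seconds)
    quat_wxyz: tuple     # fused orientation quaternion

def host_time_of(sample: ImuSample, transport_delay: float = 0.024) -> float:
    """Map a glove-side timestamp onto the host clock.

    With NTP keeping both clocks aligned, the dominant residual offset is
    the ~24 ms glove-to-PC communication delay reported for the system.
    """
    return sample.t_glove + transport_delay

s = ImuSample(imu_id=15, t_glove=10.000, quat_wxyz=(1.0, 0.0, 0.0, 0.0))
print(round(host_time_of(s), 3))  # 10.024
```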
| Component | Specification | Role |
|---|---|---|
| IMU | HI229 (BNO055), 16x, 100–400 Hz, 16-bit, 0.8° static error | Joint orientation capture |
| Processing | Raspberry Pi Zero 2 W, custom USB–UART daughter board, 480 Mbps | Aggregation, transmission |
| Data Protocol | gRPC + Protobuf, NTP-synced | Network communication |
| Tracker Integration | OptiTrack, HTC Vive, Nokov, via IPC | Global pose alignment |
2. High-DoF Hand Tracking
Each IMU in the FSGlove independently reports a 3D rotation, yielding 48 trackable Euler-angle DoFs. This design contrasts with commercial gloves such as CyberGlove III or Manus Metaglove Pro, which provide up to 21 DoFs—typically limited to flexion/extension—and do not measure torsion or fine twist. FSGlove’s denser sensor distribution supports measurement of inter-phalange torsions and enables capture of subtle manipulations, such as thumb–index fingertip rubbing.
The system maps sensor outputs directly onto the MANO hand model skeleton. Forward kinematics are computed using a chain of relative SO(3) rotations, eschewing classical Denavit–Hartenberg notation in favor of exponentials of axis-angle parameters, with joint topology and rotation axes structured according to the MANO model (Li et al., 25 Sep 2025).
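The chain of relative rotations can be sketched directly; the helper below is a minimal illustration (hypothetical `fk_orientations`, toy joint layout), assuming each pose parameter is an axis-angle vector composed along the kinematic tree as in MANO:

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def fk_orientations(axis_angles, parents):
    """Compose per-joint relative rotations into global link orientations.

    axis_angles: (J, 3) axis-angle vectors (MANO-style pose parameters).
    parents: parent index per joint (-1 for the root), defining the chain.
    Each global orientation is parent_global @ exp(axis_angle), i.e. the
    chain of relative SO(3) rotations described in the text.
    """
    glob = [None] * len(parents)
    for j, p in enumerate(parents):
        local = R.from_rotvec(axis_angles[j]).as_matrix()
        glob[j] = local if p < 0 else glob[p] @ local
    return glob

# Toy 3-joint chain (root -> proximal -> middle), 30° flexion per joint.
aa = np.deg2rad([[0, 0, 0], [30, 0, 0], [30, 0, 0]])
tip = fk_orientations(aa, parents=[-1, 0, 1])[-1]
print(np.rad2deg(R.from_matrix(tip).as_rotvec()))  # accumulates to 60° about x
```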
3. DiffHCal: Differentiable Shape-Aware Calibration
DiffHCal executes a unified, gradient-based calibration that jointly estimates the global IMU-to-MANO transform $R_g$, the per-IMU mounting misalignments $\Delta R_i$, the MANO joint pose $\theta$, and the shape parameters $\beta$.
Pose Alignment and Misalignment Correction
Let $R_i^{\text{IMU}}$ denote the orientation from each IMU $i$ in its West-North-Up frame, and $R_i^{\text{MANO}}(\theta)$ the corresponding MANO model link's orientation. DiffHCal enforces the alignment

$$R_g \, R_i^{\text{IMU}} \, \Delta R_i = R_i^{\text{MANO}}(\theta)$$

across reference poses. Minimizing the pose energy

$$E_{\text{pose}} = \sum_i \left\| \log\!\left( R_i^{\text{MANO}}(\theta)^{\top} R_g \, R_i^{\text{IMU}} \, \Delta R_i \right) \right\|^2$$

yields closed-form or fast-gradient solutions in under 10 iterations, producing sub-degree alignment accuracy.
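As an illustration of the alignment step, the global rotation admits a closed-form least-squares solution when the remaining unknowns are held fixed; the orthogonal-Procrustes sketch below is a simplified stand-in (per-IMU misalignments taken as identity), not the full DiffHCal pose energy:

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def solve_global_rotation(imu_rots, mano_rots):
    """Least-squares global rotation aligning IMU frames to MANO frames.

    Solves min_Rg sum_i || Rg @ P_i - Q_i ||_F^2 via orthogonal Procrustes:
    Rg = U diag(1, 1, det) V^T from the SVD of sum_i Q_i @ P_i^T.
    """
    M = sum(Q @ P.T for P, Q in zip(imu_rots, mano_rots))
    U, _, Vt = np.linalg.svd(M)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    return U @ D @ Vt

# Synthetic check: recover a known 25° global offset from noiseless pairs.
Rg_true = R.from_euler("z", 25, degrees=True).as_matrix()
P = list(R.random(5, random_state=0).as_matrix())
Q = [Rg_true @ p for p in P]
Rg = solve_global_rotation(P, Q)
print(np.allclose(Rg, Rg_true, atol=1e-8))  # True
```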
Personalized Shape Estimation
The shape parameter $\beta$ controls a low-dimensional PCA space of hand geometries in MANO. Using a set of contact cues $\mathcal{C} = \{(a_k, b_k)\}$, i.e., index pairs of mesh vertices that should coincide (e.g., during pinching), the shape energy

$$E_{\text{shape}} = \sum_{(a,b) \in \mathcal{C}} \left\| v_a(\theta, \beta) - v_b(\theta, \beta) \right\|^2$$

is minimized through automatic differentiation. Three to five simple contact poses are sufficient for convergence within 5–8 iterations.
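Because the PCA shape space is linear, a fixed-pose version of this contact objective reduces to linear least squares; the toy model below is illustrative (random "mesh", hypothetical `fit_shape` helper) rather than the actual auto-differentiated DiffHCal implementation:

```python
import numpy as np

def fit_shape(template, shape_basis, contact_pairs, reg=1e-9):
    """Fit PCA shape coefficients beta so contact-pair vertices coincide.

    template:    (V, 3) mean-hand vertices.
    shape_basis: (V, 3, K) linear PCA displacement basis (MANO-style).
    contact_pairs: (a, b) vertex-index pairs that should touch.
    Minimizes sum ||v_a(beta) - v_b(beta)||^2 + reg * ||beta||^2, which is
    linear in beta for a fixed pose, so one normal-equations solve suffices.
    """
    rows = [shape_basis[a] - shape_basis[b] for a, b in contact_pairs]
    rhs = [-(template[a] - template[b]) for a, b in contact_pairs]
    A, y = np.vstack(rows), np.concatenate(rhs)
    K = shape_basis.shape[2]
    return np.linalg.solve(A.T @ A + reg * np.eye(K), A.T @ y)

# Toy model: 4 vertices, 2 shape components, ground truth beta = [0.5, -0.2].
rng = np.random.default_rng(0)
T = rng.normal(size=(4, 3))
S = rng.normal(size=(4, 3, 2))
beta_true = np.array([0.5, -0.2])
V = T + S @ beta_true
T[1] += V[0] - V[1]   # make vertices 0/1 coincide at beta_true
T[3] += V[2] - V[3]   # make vertices 2/3 coincide at beta_true
beta = fit_shape(T, S, [(0, 1), (2, 3)])
print(np.round(beta, 3))  # recovers ~[0.5, -0.2]
```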
Unified Optimization
The complete optimization jointly minimizes

$$E = E_{\text{pose}} + \lambda \, E_{\text{shape}}$$

with gradient-based solvers (Adam, Levenberg–Marquardt), tuning $\lambda$ to balance pose and shape fidelity. Stopping criteria require the relative loss change between iterations to fall below a small tolerance (Li et al., 25 Sep 2025).
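A minimal sketch of such a solver, using a hand-rolled Adam update and a relative-loss-change stopping rule on a toy quadratic stand-in for the combined pose and shape energy (the targets and weighting here are invented for illustration):

```python
import numpy as np

def minimize_adam(loss_grad, x0, lr=0.01, tol=1e-8, max_iters=20000):
    """Minimal Adam loop with a relative-loss-change stopping criterion.

    loss_grad(x) -> (loss, grad). Iterates until |dE| / max(|E|, eps) < tol
    or the iteration budget is exhausted.
    """
    x = np.asarray(x0, float).copy()
    m, v = np.zeros_like(x), np.zeros_like(x)
    b1, b2, eps = 0.9, 0.999, 1e-8
    prev = np.inf
    for t in range(1, max_iters + 1):
        E, g = loss_grad(x)
        if abs(prev - E) / max(abs(E), eps) < tol:
            break
        prev = E
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        mh, vh = m / (1 - b1 ** t), v / (1 - b2 ** t)
        x -= lr * mh / (np.sqrt(vh) + eps)
    return x

# Toy combined energy E = E_pose + lam * E_shape with two quadratic terms.
lam = 0.5
target_p, target_s = np.array([1.0, -2.0]), np.array([0.5, 0.5])

def loss_grad(x):
    rp, rs = x - target_p, x - target_s
    return rp @ rp + lam * (rs @ rs), 2 * rp + 2 * lam * rs

x = minimize_adam(loss_grad, np.zeros(2))
print(np.round(x, 2))  # converges to the lam-weighted blend of the targets
```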
4. Noise Handling and Drift Compensation
Per-IMU mounting misalignment is estimated as $\Delta R_i$ during calibration. Manufacturer datasheets specify drift under 3°–10° over intervals of tens of minutes; empirical trials of 10–15 minutes showed no significant drift. The system relies on the BNO055's on-board Madgwick/Mahony filter for orientation fusion. Additional host-side processing applies a third-order Butterworth low-pass filter (approximate cutoff 20 Hz) to joint-angle sequences, mitigating high-frequency sensor noise. Fabric stretch is addressed operationally by advising users to recalibrate daily (Li et al., 25 Sep 2025).
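The host-side smoothing step can be reproduced with SciPy on a synthetic joint-angle trace; note that `filtfilt` below is zero-phase (offline), whereas a real-time pipeline would use a causal filter at the cost of some phase lag:

```python
import numpy as np
from scipy.signal import butter, filtfilt

# Third-order Butterworth low-pass at ~20 Hz, matching the host-side
# smoothing described for the 100 Hz joint-angle streams.
fs, cutoff = 100.0, 20.0
b, a = butter(3, cutoff / (fs / 2))  # cutoff normalized to Nyquist

t = np.arange(0, 2, 1 / fs)
angle = 30 * np.sin(2 * np.pi * 1.5 * t)                    # slow flexion, deg
noisy = angle + 2 * np.random.default_rng(1).normal(size=t.size)
smooth = filtfilt(b, a, noisy)                              # zero-phase filtering

print(np.std(smooth - angle) < np.std(noisy - angle))       # noise is reduced
```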
5. Quantitative Evaluation
FSGlove achieves high joint and shape accuracy benchmarks:
- Single-joint accuracy: Using a two-link 3D-printed rig validated against Nokov optical MoCap (0.3° angular reference), FSGlove yields a bias within ±2.7°, a standard deviation of 1.8°, and nonlinearity below 0.7%.
- Shape reconstruction: Using unidirectional Chamfer distance between Photoneo depth-based partial point clouds and the reconstructed MANO mesh, the mean mesh error is ≤3.6 mm, comparable to Quest 3 and superior to Manus/VRTRIX.
- Fingertip pinch tracking: Mean thumb–fingertip pinch distance is ≈15.7 mm (Quest 3: 19.6 mm, VRTRIX: 18.9 mm, Manus: 33.2 mm).
- Hand-object interaction: On ContactPose objects, the mean fingertip-to-object point-to-mesh error is ≤20.2 mm (Quest 3: 28.0 mm, VRTRIX: 24.5 mm, Manus: 26.7 mm).
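The unidirectional Chamfer metric used above can be sketched as follows; this vertex-based version is a simplification, since a true point-to-mesh distance would project each point onto triangle faces rather than snapping to the nearest vertex:

```python
import numpy as np

def unidirectional_chamfer(points, mesh_vertices):
    """Mean distance from each observed point to its nearest mesh vertex.

    points:        (N, 3) partial point cloud (e.g., from a depth camera).
    mesh_vertices: (M, 3) vertices of the reconstructed hand mesh.
    """
    d = np.linalg.norm(points[:, None, :] - mesh_vertices[None, :, :], axis=-1)
    return d.min(axis=1).mean()

# Toy check: points offset 1 mm along x from two mesh vertices.
mesh = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
pts = mesh + np.array([1.0, 0.0, 0.0])
print(unidirectional_chamfer(pts, mesh))  # 1.0
```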
| Task | FSGlove Performance | Comparative Glove Performance |
|---|---|---|
| Joint angle error | ≤2.7° bias, σ = 1.8° | (Better than or comparable to others) |
| Mean mesh error (Chamfer, mm) | ≤3.6 mm | Manus/VRTRIX: higher; Quest 3: similar |
| Pinch mean distance (mm) | 15.7 mm | Quest 3: 19.6, VRTRIX: 18.9, Manus: 33.2 |
| Hand-object error (mm) | ≤20.2 mm | Quest 3: 28.0; VRTRIX: 24.5; Manus: 26.7 |
6. Open-Source Ecosystem and Compatibility
All hardware (CAD, FPC layouts, bill of materials, 3D mounts) is released under the MIT license. Firmware (C++/Linux), low-level drivers, and high-level APIs (Python, C#) support integration with Unity, Unreal, and ROS environments. Example gRPC/Protobuf interfaces to OptiTrack, HTC Vive, and Nokov are implemented, and setup/calibration/integration tutorials are publicly available (Li et al., 25 Sep 2025).
FSGlove’s combination of dense inertial sensing and unified, differentiable calibration achieves industry-leading joint accuracy (bias within ±2.7°) and millimeter-level shape fidelity (≤3.6 mm mean mesh error). The robust, modular, and open design facilitates adoption in VR, robotics, and scientific hand-tracking applications.