DexForce: Force-Informed Imitation Learning
- DexForce is a methodology that integrates tactile feedback with kinesthetic demonstrations to generate force-informed position targets for dexterous manipulation tasks.
- It employs a two-stage process: contact forces are measured with six-axis force-torque sensors during kinesthetic teaching, force-informed position targets are computed by inverting the impedance control law, and the reconstructed trajectories are replayed under a Cartesian impedance controller to collect training data.
- Quantitative evaluations show large performance gains over force-unaware baselines on tasks such as sliding a cube, opening an AirPods case, and unscrewing a nut.
DexForce is a methodology for generating force-informed action labels for imitation learning of dexterous, contact-rich manipulation tasks from kinesthetic demonstrations augmented with tactile sensing. It addresses the fundamental problem that existing demonstration techniques for dexterous robots—specifically retargeted teleoperation without direct haptic feedback—fail to capture the precise contact forces necessary for success in tasks that require high manipulation dexterity. By instrumenting robot fingertips with six-axis force-torque sensors and algorithmically extracting position targets that encode the required contact dynamics, DexForce enables the training of policies that can replicate human-level contact skills in fine-grained manipulation.
1. Motivation and Problem Setting
Dexterous manipulation tasks such as opening an AirPods case or unscrewing a nut depend not only on generating precise finger trajectories but also on modulating contact forces at the right time and location. Traditional means of generating demonstrations, such as teleoperation or motion retargeting from human hands, suffer from poor human-to-robot motion correspondence and the absence of haptic feedback, making it extremely difficult for demonstrators to convey the necessary force information.
Kinesthetic teaching—where a human guides the robot's fingers directly—affords direct haptic feedback and high correspondence, but it records only kinematic (fingertip pose) and force data; the recorded poses cannot simply be replayed as position targets, because commanding the measured poses at contact would produce near-zero forces. DexForce closes this gap by leveraging fingertip force-torque data to reconstruct, via an "inverse impedance controller," the position targets that, when replayed, induce the measured contact forces. This yields a dataset whose supervision signal encodes both positional trajectories and the underlying force requirements.
2. Algorithmic Framework
DexForce operates in a two-stage process:
Stage 1: Extraction of Force-Informed Position Targets
- Kinesthetic demonstrations are recorded over $T$ time steps, yielding per-fingertip sequences of measured fingertip positions $p_t$ and measured 3-axis contact forces $f_t$.
- Cartesian impedance control is employed, governed by:

$$F = K_p (x_d - x) - K_d \dot{x}, \qquad \tau = J^\top F + \tau_g$$

where $F$ is the commanded fingertip force, $x_d$ the desired position, $x$ the current position, $K_p$/$K_d$ the stiffness/damping gains, $J$ the Jacobian, $\tau_g$ the gravity-compensation torque, and $\tau$ the joint torques.
- To compute force-informed targets $\tilde{x}_t$ that yield the measured contact force $f_t$ at the measured fingertip position $p_t$ under quasi-static conditions, the stiffness relation is inverted:

$$f_t = K (\tilde{x}_t - p_t) \;\Rightarrow\; \tilde{x}_t = p_t + K^{-1} f_t$$

where $K$ is a diagonal stiffness matrix, hand-tuned so that replaying the targets reproduces the observed forces.
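Stage 1's target extraction can be sketched in a few lines. This is a minimal illustration assuming 3-axis positions and forces and a hand-tuned diagonal stiffness; the function name and the numeric values are illustrative, not taken from the paper:

```python
import numpy as np

def force_informed_targets(positions, forces, stiffness):
    """Invert the quasi-static impedance relation f = K (x_d - x)
    to recover position targets that reproduce the measured forces.

    positions: (T, 3) measured fingertip positions p_t
    forces:    (T, 3) measured contact forces f_t
    stiffness: (3,)   diagonal stiffness gains K (hand-tuned)
    """
    K_inv = 1.0 / np.asarray(stiffness)      # inverse of diagonal K
    return positions + forces * K_inv        # x~_t = p_t + K^{-1} f_t

# Example: a fingertip pressing down with 2 N against a surface
p = np.array([[0.10, 0.00, 0.05]])          # measured pose (m)
f = np.array([[0.00, 0.00, -2.0]])          # measured force (N)
K = np.array([200.0, 200.0, 200.0])         # stiffness (N/m), assumed value
targets = force_informed_targets(p, f, K)
# The target sits 1 cm below the measured pose, so the impedance
# controller exerts the recorded 2 N once contact is made.
```

Note that the target is displaced *into* the contact surface; replaying the raw measured pose instead would leave the controller exerting essentially no force.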
Stage 2: Replay and Demonstration Augmentation
- The reconstructed trajectories are replayed using the impedance controller.
- At each time step of the replay, the new fingertip pose $p_t$, the measured force/torque $f_t$, and a wrist RGB image $I_t$ are recorded, producing data tuples $(I_t, f_t, p_t, \tilde{x}_t)$ for imitation learning.
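The replay-and-record loop of Stage 2 can be sketched as below. The `robot` interface and its method names are assumptions for illustration, not an actual API:

```python
from dataclasses import dataclass, field

@dataclass
class ReplayDataset:
    """Stage 2 sketch: replay force-informed targets and log training tuples.
    The `robot` object is a hypothetical hardware interface."""
    tuples: list = field(default_factory=list)

    def collect(self, robot, targets):
        for x_target in targets:             # force-informed targets from Stage 1
            robot.command_position(x_target) # impedance controller tracks target
            self.tuples.append({
                "pose": robot.fingertip_pose(),   # p_t
                "wrench": robot.ft_reading(),     # 6-axis f_t
                "image": robot.wrist_rgb(),       # I_t
                "action": x_target,               # supervision label
            })
        return self.tuples
```

The key design point is that the logged action is the *reconstructed* target, not the pose the robot actually reached, so the learned policy inherits the force-inducing offsets.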
3. Policy Learning and Architecture
The policy learning objective is to train a model that predicts future sequences from recent multi-modal sensory inputs:
- Observations at each time step are $o_t = (\phi(I_t), f_t)$, where $\phi$ is a ResNet18 encoder applied to the RGB image $I_t$ and $f_t$ are the measured forces/torques per fingertip.
- The supervised objective minimizes the mean squared error between predicted actions $\hat{a}_t$ and the force-informed targets $a_t$ from Stage 1:

$$\mathcal{L} = \frac{1}{T} \sum_{t=1}^{T} \lVert \hat{a}_t - a_t \rVert^2$$
- The policy is realized as a conditional diffusion model over sequences, structured as:
- Observation horizon: a short window of recent steps, each comprising the wrist RGB image and the 6-axis force and moment reading per fingertip
- Prediction and execution horizons: the policy predicts a sequence of future actions but executes only a shorter prefix before re-planning
- Output: a sequence of fingertip 3D position targets executed directly by the impedance controller
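The receding-horizon execution described above can be sketched as follows. The `policy` callable, the horizon values, and the helper names are illustrative placeholders, not the paper's settings:

```python
import numpy as np

def run_receding_horizon(policy, get_observation, execute_target,
                         n_steps, pred_horizon=16, exec_horizon=8):
    """Predict pred_horizon future position targets, execute only the
    first exec_horizon of them, then re-plan from fresh observations."""
    executed = []
    while len(executed) < n_steps:
        obs = get_observation()                  # RGB features + F/T readings
        plan = policy(obs)                       # (pred_horizon, 3) targets
        assert plan.shape == (pred_horizon, 3)
        for target in plan[:exec_horizon]:       # commit only a short prefix
            execute_target(target)               # one impedance-control step
            executed.append(target)
    return np.array(executed[:n_steps])
```

Executing only a prefix before re-planning lets the policy react to fresh force readings, which matters when contact conditions change mid-trajectory.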
4. Evaluation: Manipulation Tasks and Quantitative Results
DexForce was evaluated on six dexterous manipulation tasks, each requiring one or two robot fingers and exhibiting high contact richness:
| Task | Force-informed actions (success rate) | Naive actions, no force (success rate) |
|---|---|---|
| Slide cube | 90% | 0% |
| Reorient smiley | 80% | 0% |
| Open AirPods | 57% | 0% |
| Grasp battery | 80% | 0% |
| Unscrew nut | 80% | 0% |
| Flip box | 95% | 0% |
| Average | 76% | ~0% |
These results demonstrate that, without explicit force-informed action targets, imitation learning policies almost never succeed in contact-rich dexterous tasks. Embedding the demonstrated forces into the policy's action targets is therefore essential for high success rates in this domain.
5. Sensory Modalities and Ablation Study
To assess the utility of tactile feedback during policy execution, three observation variants were evaluated:
- Visual only (RGB)
- Visual + binary contact (per-fingertip flags set when the measured force magnitude exceeds a threshold)
- Visual + full 6-axis force/torque (F/T)
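A sketch of how the force portion of the observation could be built for each of the three variants. The threshold value and function name are assumptions for illustration, not values from the paper:

```python
import numpy as np

def force_features(wrench, variant, contact_threshold=0.5):
    """Build the force part of the observation for one fingertip.
    wrench: (6,) force/torque reading; threshold (N) is an assumed value."""
    if variant == "rgb_only":
        return np.empty(0)                       # no force input at all
    if variant == "binary_contact":
        in_contact = np.linalg.norm(wrench[:3]) > contact_threshold
        return np.array([float(in_contact)])     # single contact flag
    if variant == "full_ft":
        return np.asarray(wrench, dtype=float)   # all 6 components
    raise ValueError(f"unknown variant: {variant}")

w = np.array([0.0, 0.0, -2.0, 0.0, 0.0, 0.0])    # 2 N normal force
flag = force_features(w, "binary_contact")        # collapses to a 0/1 flag
full = force_features(w, "full_ft")               # keeps magnitude and direction
```

The binary variant discards force magnitude and direction, which is one plausible reason it underperforms full F/T input in the ablation.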
The following findings were established:
- Including full F/T data never degrades performance and leads to statistically significant improvements in tasks with high precision and coordination requirements:
- Open AirPods: +1600% success relative to RGB-only
- Unscrew nut: +136%
- Slide cube: +58.8%
- Binary contact indicators are insufficient compared to full F/T data and may underperform pure visual policies.
- Without real-time force feedback, policies are prone to failure modes such as pushing in the wrong direction or gripping with insufficient force.
6. Limitations and Prospects for Development
DexForce’s core limitations and identified future work include:
- Throughput: Kinesthetic teaching is inherently lower throughput than teleoperation; automating or expediting force-informed data collection is a key challenge.
- Dexterity scope: Current demonstrations use only a subset (one or two) of robot fingers. Hardware or user interface advances will be required for scalable full-hand demonstrations.
- Hybrid methods: Integrating teleoperation with haptic feedback systems is a promising direction for combining high demonstration throughput with the force observability that underpins DexForce.
By algorithmically encoding force dynamics into demonstration actions, DexForce establishes a practical paradigm for teaching robots the nuanced force behaviors essential for complex, contact-rich manipulation in real-world settings.