
DexForce: Force-Informed Imitation Learning

Updated 13 November 2025
  • DexForce is a methodology that integrates tactile feedback with kinesthetic demonstrations to generate force-informed position targets for dexterous manipulation tasks.
  • It employs a two-stage process: six-axis force-torque sensors capture contact forces during kinesthetic teaching, an inverse impedance mapping converts them into position targets, and the reconstructed trajectories are replayed under impedance control.
  • Quantitative evaluations show significant performance improvements over non-force approaches in tasks such as sliding cubes, opening an AirPods case, and unscrewing nuts.

DexForce is a methodology for generating force-informed action labels for imitation learning of dexterous, contact-rich manipulation tasks from kinesthetic demonstrations augmented with tactile sensing. It addresses the fundamental problem that existing demonstration techniques for dexterous robots—specifically retargeted teleoperation without direct haptic feedback—fail to capture the precise contact forces necessary for success in tasks that require high manipulation dexterity. By instrumenting robot fingertips with six-axis force-torque sensors and algorithmically extracting position targets that encode the required contact dynamics, DexForce enables the training of policies that can replicate human-level contact skills in fine-grained manipulation.

1. Motivation and Problem Setting

Dexterous manipulation tasks such as opening an AirPods case or unscrewing a nut depend not only on precise finger trajectories but also on modulating contact forces at the right time and location. Traditional means of generating demonstrations, such as teleoperation or motion retargeting from human hands, suffer from poor human-to-robot motion correspondence and the absence of haptic feedback, making it extremely difficult for demonstrators to convey the necessary force information.

Kinesthetic teaching—where a human guides the robot's fingers directly—affords direct haptic feedback and high correspondence, but it records only kinematic (fingertip pose) and force data, not the control actions a policy must output. DexForce closes this gap by leveraging fingertip force-torque data to reconstruct, via an "inverse impedance controller," the position targets that, when replayed, induce the measured contact forces. This approach creates a dataset whose supervision signal encodes both positional trajectories and the underlying force requirements.

2. Algorithmic Framework

DexForce operates in a two-stage process:

Stage 1: Extraction of Force-Informed Position Targets

  • Kinesthetic demonstrations are recorded at $T$ time steps, yielding sequences $\{x_{o,1},\ldots,x_{o,T}\}$ (robot fingertip positions) and $\{f_1,\ldots,f_T\}$ (measured 3-axis contact forces) per fingertip.
  • Cartesian impedance control is employed, governed by:

$$F = K_p(x_d - x_c) - K_v\dot{x}_c$$

$$\tau = J^T F + g$$

where $F$ is the commanded fingertip force, $x_d$ the desired position, $x_c$ the current position, $K_p$/$K_v$ the stiffness/damping gains, $J$ the Jacobian, $g$ gravity compensation, and $\tau$ the joint torques.
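The control law above can be sketched in a few lines of NumPy; the function name, gain values, and the 3-DoF toy example are illustrative, not from the paper:

```python
import numpy as np

def impedance_torques(x_d, x_c, xdot_c, J, g, Kp, Kv):
    """Cartesian impedance control: map a desired fingertip position
    to joint torques (names and shapes are illustrative)."""
    # Virtual spring-damper force at the fingertip: F = Kp (x_d - x_c) - Kv xdot_c
    F = Kp @ (x_d - x_c) - Kv @ xdot_c
    # Joint torques via the Jacobian transpose, plus gravity compensation
    tau = J.T @ F + g
    return tau

# Toy example: 3-DoF finger with a 3D fingertip position, at rest,
# commanded 1 cm ahead of its current position
Kp = np.diag([200.0, 200.0, 200.0])   # stiffness [N/m]
Kv = np.diag([10.0, 10.0, 10.0])      # damping [N s/m]
J = np.eye(3)                          # Jacobian at the current configuration
g = np.zeros(3)                        # gravity-compensation torques
tau = impedance_torques(np.array([0.01, 0.0, 0.0]),
                        np.zeros(3), np.zeros(3), J, g, Kp, Kv)
```

With the identity Jacobian, the 1 cm position error and 200 N/m stiffness yield a 2 N virtual force, mapped directly to joint torques.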

  • To compute force-informed targets $x_f$ that yield the required force $f$ under quasi-static conditions, a linear mapping is assumed:

$$x_f = x_o + K_f f$$

where $K_f$ is a hand-tuned diagonal gain matrix chosen so that the replayed controller reproduces the observed forces.
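Stage 1 then reduces to a per-timestep offset of each recorded position by the measured force. A minimal sketch, where the gain values and array shapes are assumptions for illustration:

```python
import numpy as np

def force_informed_targets(x_o, f, Kf):
    """Stage 1: offset each recorded fingertip position x_o[t] by the
    measured contact force f[t], scaled by a hand-tuned diagonal gain Kf.
    x_o, f: (T, 3) arrays; Kf: (3, 3) diagonal matrix (illustrative)."""
    # x_f[t] = x_o[t] + Kf @ f[t], vectorized over all time steps
    return x_o + f @ Kf.T

T = 4
x_o = np.zeros((T, 3))                    # recorded fingertip positions [m]
f = np.tile([0.0, 0.0, 2.0], (T, 1))      # measured 2 N normal force
Kf = np.diag([0.005, 0.005, 0.005])       # hand-tuned gain [m/N] (assumed value)
x_f = force_informed_targets(x_o, f, Kf)  # each target offset 1 cm along z
```

The replayed impedance controller, tracking these offset targets against the contact surface, then produces approximately the demonstrated forces.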

Stage 2: Replay and Demonstration Augmentation

  • The reconstructed trajectories $x_{f,1:T}$ are replayed using the impedance controller.
  • At each time step, a new pose $x_o^*$, force/torque readings $(f^*, m^*)$, and a wrist RGB image $I$ are recorded, producing data tuples $(I_t, f^*_t, m^*_t, x_{f,t})$ for imitation learning.
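The resulting dataset can be represented as a simple list of per-timestep records; the class and field names below are assumptions for illustration:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class ReplaySample:
    """One training tuple recorded while replaying x_f (field names assumed)."""
    image: np.ndarray    # wrist RGB frame I_t
    force: np.ndarray    # measured fingertip force f*_t, shape (3,)
    moment: np.ndarray   # measured fingertip moment m*_t, shape (3,)
    target: np.ndarray   # force-informed position target x_{f,t}, shape (3,)

def record_replay(images, forces, moments, targets):
    # Pair each replayed time step's observations with its action label
    return [ReplaySample(I, f, m, x)
            for I, f, m, x in zip(images, forces, moments, targets)]

# Tiny synthetic replay of T = 2 steps
T = 2
images = [np.zeros((128, 128, 3), dtype=np.uint8) for _ in range(T)]
forces = [np.array([0.0, 0.0, 1.5])] * T
moments = [np.zeros(3)] * T
targets = [np.array([0.1, 0.0, 0.02])] * T
dataset = record_replay(images, forces, moments, targets)
```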

3. Policy Learning and Architecture

The policy learning objective is to train a model $\pi_\theta$ that predicts future $x_f$ sequences from recent multi-modal sensory inputs:

  • Observations at each time step are $o_t = [\phi(I_t), f^*_t, m^*_t]$, where $\phi$ is a ResNet18 encoder for the RGB image and $f^*, m^* \in \mathbb{R}^3$ are the measured forces/torques per fingertip.
  • The supervised objective minimizes the mean squared error:

$$L(\theta) = \sum_{t=1}^{T} \| x_{f,t} - \hat{x}_{f,t} \|^2$$

  • The policy is realized as a conditional diffusion model over sequences, structured as:
    • Observation horizon $h = 2$ steps (each with $128 \times 128$ or $256 \times 256$ RGB plus 6-axis force and moment per fingertip)
    • Prediction horizon $H_p = 16$, execution horizon $H_e = 8$
    • Output is a sequence of fingertip 3D position targets for direct impedance control execution.
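The training objective and the receding-horizon execution pattern above can be sketched as follows; this is a simplified stand-in (the actual policy is a conditional diffusion model, which is omitted here), and the function names are illustrative:

```python
import numpy as np

def policy_loss(x_f, x_f_hat):
    """Summed squared error between force-informed targets and the
    policy's predictions; both arrays have shape (T, 3)."""
    return float(np.sum(np.square(x_f - x_f_hat)))

def receding_horizon_step(predict, obs_history, H_e=8):
    """Predict a full H_p-step target sequence but execute only the first
    H_e targets before re-planning from fresh observations (illustrative)."""
    x_f_seq = predict(obs_history)   # shape (H_p, 3)
    return x_f_seq[:H_e]

# Toy check with a constant predictor over H_p = 16 steps
predict = lambda obs: np.tile([0.1, 0.0, 0.02], (16, 1))
executed = receding_horizon_step(predict, obs_history=None, H_e=8)
loss = policy_loss(np.ones((2, 3)), np.zeros((2, 3)))
```

Executing only the first $H_e$ of $H_p$ predicted targets lets the policy correct itself with fresh force readings every few steps.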

4. Evaluation: Manipulation Tasks and Quantitative Results

DexForce was evaluated on six dexterous manipulation tasks, each requiring one or two robot fingers and exhibiting high contact richness:

| Task | Success (force-informed) | Success (no-force baseline) |
| --- | --- | --- |
| Slide cube | 90% | 0% |
| Reorient smiley | 80% | 0% |
| Open AirPods | 57% | 0% |
| Grasp battery | 80% | 0% |
| Unscrew nut | 80% | 0% |
| Flip box | 95% | 0% |
| Average | 76% | ~0% |

These results demonstrate that absent explicit force-informed action targets, imitation learning policies almost never succeed in contact-rich dexterous tasks. Embedding demonstrated forces into policy targets is therefore a strict requirement for high success rates in this domain.

5. Sensory Modalities and Ablation Study

To assess the utility of tactile feedback during policy execution, three observation variants were evaluated:

  • Visual only (RGB)
  • Visual + binary contact (flags set if $|f| > 0.55\,\mathrm{N}$)
  • Visual + full 6-axis force/torque (F/T)

The following findings were established:

  • Including full F/T data never degrades performance and leads to statistically significant improvements in tasks with high precision and coordination requirements:
    • Open AirPods: +1600% success relative to RGB-only
    • Unscrew nut: +136%
    • Slide cube: +58.8%
  • Binary contact indicators are insufficient compared to full F/T data and may underperform pure visual policies.
  • In the absence of real-time force feedback, policies are prone to failure due to incorrect pushing or insufficient gripping.
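The three observation variants in the ablation can be assembled as follows; feature dimensions, variant names, and the flag convention are assumptions for illustration:

```python
import numpy as np

CONTACT_THRESHOLD_N = 0.55  # contact threshold from the ablation

def make_observation(rgb_feat, f, m, variant):
    """Assemble one of the three ablation observation variants.
    rgb_feat: visual feature vector; f, m: per-fingertip force/moment (3,)."""
    if variant == "rgb":
        return rgb_feat
    if variant == "rgb+binary":
        # Single in-contact flag: is the force magnitude above 0.55 N?
        flag = np.array([float(np.linalg.norm(f) > CONTACT_THRESHOLD_N)])
        return np.concatenate([rgb_feat, flag])
    if variant == "rgb+ft":
        # Full 6-axis force/torque appended to the visual features
        return np.concatenate([rgb_feat, f, m])
    raise ValueError(f"unknown variant: {variant}")

feat = np.zeros(4)                   # stand-in for ResNet features
f_meas = np.array([0.0, 0.0, 1.0])   # 1 N normal force (in contact)
m_meas = np.zeros(3)
obs = make_observation(feat, f_meas, m_meas, "rgb+binary")
```

The binary variant collapses the 6-axis reading to one bit per fingertip, which is exactly the information loss the ablation shows to be harmful.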

6. Limitations and Prospects for Development

DexForce’s core limitations and identified future work include:

  • Throughput: Kinesthetic teaching is inherently lower throughput than teleoperation; automating or expediting force-informed data collection is a key challenge.
  • Dexterity scope: Current demonstrations use only a subset (one or two) of robot fingers. Hardware or user interface advances will be required for scalable full-hand demonstrations.
  • Hybrid methods: Integrating teleoperation with haptic feedback systems is a promising direction for combining high demonstration throughput with the force observability that underpins DexForce.

By algorithmically encoding force dynamics into demonstration actions, DexForce establishes a practical paradigm for teaching robots the nuanced force behaviors essential for complex, contact-rich manipulation in real-world settings.
