Force-Guided Imitation Learning
- Force-guided imitation learning is defined as methods that integrate force data with position and velocity inputs to achieve robust, adaptive robotic manipulation.
- It employs hybrid force–position control architectures and bilateral teleoperation to address challenges in contact-rich tasks such as assembly and delicate interaction.
- Experimental results demonstrate improved success rates, enhanced safety, and human-like reflexes, significantly outperforming vision-only approaches.
Force-Guided Imitation Learning for Manipulation
Force-guided imitation learning for manipulation encompasses a class of methods that utilize force/torque (F/T) information, in addition to or in synergy with position and velocity data, to train policies capable of robust, compliant, and adaptive performance on contact-rich manipulation tasks. This paradigm is motivated by the physical reality that many everyday robotic tasks—such as assembly, drawing, wiping, and delicately interacting with sensitive objects—demand active regulation of both position and interaction forces. Force guidance is especially critical for handling variable environments, mitigating uncertainty in contact events, and transferring human dexterity into robotic behavior.
1. Principles of Force-Guided Imitation Learning
Force-guided imitation learning extends standard behavior cloning and inverse-reinforcement learning frameworks by explicitly incorporating force information at data collection, model training, and (in many cases) policy execution. At a high level, the methodology seeks to overcome the limitations of position- or vision-only imitation learning, which often fails in the presence of unmodeled contact, compliance mismatches, or environmental perturbations.
Core principles include:
- Sensing and Separation of Forces: Accurate force feedback requires distinguishing between the operator’s action forces and the environment’s reaction forces. Bilateral control architectures, such as 4-channel master–slave setups, are effective for this purpose, enforcing both position synchronization and action–reaction consistency (e.g., $\theta_m - \theta_s = 0$ and $\tau_m + \tau_s = 0$) and enabling observers to decompose the measured torques into action and reaction components (Adachi et al., 2018).
- Hybrid Force–Position Representation: Policies are enriched with force inputs, and often predict actions in a hybrid space (e.g., pose increments, reference forces/torques), which are then realized using impedance, admittance, or hybrid controllers (Adachi et al., 2018, Liu et al., 2024).
- Closed-Loop Force Feedback: Real-time adaptation to force signals allows the robot to maintain desired contact conditions, compensate for variable compliance, and recover from disturbances (Ge et al., 21 Sep 2025, Stepputtis et al., 2022).
- Demonstration Quality: Haptic feedback to demonstrators during data collection (e.g., immersive gloves, vibrotactile, visual-AR feedback) leads to more consistent, safer, and lower-force demonstration trajectories, which directly improve policy robustness even if the learned policy does not observe force at test time (Li et al., 2023).
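The position-symmetry and action–reaction goals above can be sketched as a toy 4-channel bilateral control law. The gains, the joint-space formulation, and the even split of corrections between master and slave are illustrative assumptions, not the tuning of any cited system:

```python
import numpy as np

def bilateral_torque_commands(q_m, dq_m, tau_m, q_s, dq_s, tau_s,
                              kp=100.0, kd=10.0, kf=1.0):
    """Toy 4-channel bilateral controller (all gains illustrative).

    The position terms drive q_m - q_s -> 0 on both arms; the force
    term drives tau_m + tau_s -> 0 (action-reaction consistency),
    so human action and environment reaction can be logged separately.
    """
    pos_err = q_s - q_m
    vel_err = dq_s - dq_m
    force_err = -(tau_m + tau_s)  # zero when action balances reaction
    # Each arm applies half the correction; position terms have
    # opposite signs so the two arms move toward each other.
    u_m = 0.5 * (kp * pos_err + kd * vel_err) + 0.5 * kf * force_err
    u_s = 0.5 * (-kp * pos_err - kd * vel_err) + 0.5 * kf * force_err
    return u_m, u_s
```

When the arms are synchronized and the measured torques already satisfy action–reaction, both commands vanish, which is the equilibrium the 4-channel scheme maintains.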
2. System Architectures and Control Foundations
A variety of system architectures have been designed to support force-guided imitation learning. The architectural selection depends on the task’s contact richness, the actuation/sensing modalities, and the robot’s control granularity.
- Bilateral Control Setups: 4-channel bilateral teleoperation (two identical arms, inner- and outer-loop cross-coupled position and force controllers) enables independent logging of action and reaction forces, essential for learning nuanced human response strategies (Adachi et al., 2018, Sasagawa et al., 2019, Sakaino, 2021).
- Impedance and Admittance Control: Force-guided IL frameworks typically use impedance control laws of the form
$M(\ddot{x} - \ddot{x}_d) + D(\dot{x} - \dot{x}_d) + K(x - x_d) = F_{\mathrm{ext}}$
to realize desired interaction dynamics, or admittance control to modulate reference velocities as a function of the observed force error (Ge et al., 21 Sep 2025, Yang et al., 2023, Stepputtis et al., 2022).
- Hybrid and Orthogonal Controllers: In tasks requiring distinct regulation along and across the direction of motion (e.g., peeling or assembly), hybrid strategies decouple force and position tracking in orthogonal subspaces, implementing parallel impedance (along trajectory) and admittance (across trajectory) control (Liu et al., 2024).
- Sensorless and Multimodal Force Estimation: For platforms lacking F/T sensors, model-based estimators leveraging joint torques and geometric Jacobians infer end-effector interaction forces (Ge et al., 21 Sep 2025, Zhi et al., 27 May 2025), while visuotactile or tactile arrays yield spatially-resolved contact forces (Helmut et al., 15 Oct 2025, Ablett et al., 2023).
3. Learning Algorithms and Policy Representations
Learning in force-guided imitation pipelines combines sequence models (LSTM, Transformer), multisensor fusion, and diffusion or adversarial learning objectives, matched to the signal modalities and supervision available.
- Recurrent and Temporal Models: LSTM-based networks consume time series of positions, velocities, and force/torques, predicting future actions in commanded reference space (torque, velocity, position, or force) (Adachi et al., 2018, Sasagawa et al., 2019, Sakaino, 2021).
- Multimodal Transformers and Diffusion Policies: Frequency-aware transformers encode asynchronous high-rate force and image streams, fusing them via cross-attention to exploit transient contact signals (Lee et al., 23 Sep 2025). Diffusion models generate consistent action sequences adaptively conditioned on force inputs (Chen et al., 17 Jan 2025).
- Explicit Force-Action Spaces: Several frameworks convert force readings into force-conditioned action targets, so that policy outputs are regularized to produce contact forces matching the demonstrations (Chen et al., 17 Jan 2025, Ablett et al., 2023).
- Task-Parameterized and Hierarchical Models: Policies are often structured into high-level (skill/trajectory) and low-level (force adaptation) modules, with the low-level adapted online via RL or probabilistic inference to balance trajectory and force reproduction (Wang et al., 2021, Le et al., 2021).
4. Experimental Methodologies and Benchmarks
Force-guided imitation learning has been validated across a range of contact-rich tasks: assembly, tool use, food serving, drawing, wiping and cleaning, arc tracing, screw insertion, and dexterous in-hand manipulation.
- Demonstration Collection Modalities:
- Bilateral teleoperation and haptic master interfaces for action–reaction decoupling (Adachi et al., 2018, Sasagawa et al., 2019)
- Handheld force-moment capture devices for natural demonstration (Liu et al., 2024, Lee et al., 23 Sep 2025)
- Full visuotactile arrays to record dense in-contact data (Helmut et al., 15 Oct 2025, Ablett et al., 2023)
- Wearable feedback gloves and palm interfaces for immersive human demonstration (Li et al., 2023)
- Metrics:
- Task success rate in contact-rich evaluation (e.g., >98% for bilateral force policies vs. ≤70% for position-only (Sasagawa et al., 2019))
- Force accuracy and fidelity to human demonstration (e.g., 96% of target reference force (Tsuji et al., 9 May 2025))
- Policy robustness to perturbations—material, geometric, dynamic (Zhi et al., 27 May 2025, Ge et al., 21 Sep 2025)
- Contact consistency, applied force variance (Li et al., 2023)
- Sample complexity and learning curve speedup via force-based reward signal (You et al., 24 Jan 2025)
- Ablations/Comparisons:
- Vision-only baselines (often ≤22% success in complex tasks) vs. RGB+force (up to 83% (Lee et al., 23 Sep 2025))
- Policy observation composition: including force inputs does not degrade performance and, for high-precision tasks, drastically increases success (+1600% for AirPods opening (Chen et al., 17 Jan 2025))
5. Impact and Analysis of Force Guidance
Incorporating force into imitation learning yields qualitatively and quantitatively superior performance on manipulation tasks requiring dexterous, safe, and adaptive control.
- Adaptivity and Generalization: Policies trained with force guidance surpass position- or vision-only counterparts in generalizing to unseen objects, contact conditions, and dynamic disturbances, e.g. maintaining near-100% success across variations in object shape and perturbation (Sasagawa et al., 2019, Tsuji et al., 9 May 2025, Liu et al., 2024).
- Safety and Compliance: Force-aware controllers achieve substantially lower contact forces and variances, minimizing damage to both robot and environment, and enabling interaction with fragile objects (Yang et al., 2023, Li et al., 2023).
- Human-Like Reflexes: By learning explicit action–force mappings and capturing human response strategies, robots exhibit the ability to compensate for contact loss or environmental uncertainty in a human-like manner, including dynamic adaptation to external pushes (Sasagawa et al., 2019, Stepputtis et al., 2022).
- Demonstration Quality: Immersive force feedback to demonstrators leads to demonstrations with lower mean force and variance, faster execution, and safer interaction, and these attributes propagate to policies trained on such data—even without force as an explicit input during deployment (Li et al., 2023).
6. Limitations and Future Directions
Despite advances, several limitations and frontiers remain for force-guided imitation learning:
- Sensing and Sim2Real: Accurate force estimation without F/T sensors remains challenging; model-based estimators are sensitive to model error, and visuotactile signals often require meticulous calibration. Transfer of force-based skills from simulation to hardware (and between platforms) is not yet robustly solved (Ge et al., 21 Sep 2025, You et al., 24 Jan 2025).
- Control Stability: Direct torque command policies (controller cloning) may be unstable in certain configurations; hybrid architectures blending learned command with classical impedance/admittance control are more robust (Adachi et al., 2018, Liu et al., 2024).
- Data Collection Burden: High-quality force-annotated demonstrations are more demanding to collect and may require specialized devices or operator training (Lee et al., 23 Sep 2025, Stepputtis et al., 2022).
- Skill and Primitive Composition: Most force-guided IL frameworks operate at the level of atomic skills; learning complex, hierarchical activities or selecting among multiple force-conditioning primitives in a data-driven manner is an open direction (Liu et al., 2024).
- Multi-finger and Whole-Body Coordination: Extending to high-DOF hands and bimanual or mobile manipulation introduces additional complexity in tactile coverage, force coordination, and control design (Helmut et al., 15 Oct 2025, Stepputtis et al., 2022).
- Real-Time Adaptation: Optimization of stiffness, damping, and force-targets for novel tasks or environments remains a key research topic, including the design of policies that can adapt these parameters online (Ge et al., 21 Sep 2025).
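The last bullet can be made concrete with a deliberately simple heuristic: soften the stiffness when the contact-force error is large (to comply), and stiffen back otherwise (to track). The threshold, bounds, and multiplicative rate are all assumed values for illustration, not a published adaptation law:

```python
def adapt_stiffness(k, f_err, k_min=50.0, k_max=500.0, gamma=0.9,
                    threshold=5.0):
    """Heuristic online stiffness adaptation (illustrative only).

    Large force error (|f_err| > threshold, in newtons) -> multiply
    stiffness by gamma < 1 to comply; otherwise relax it back toward
    k_max by dividing by gamma. Bounds keep the controller passive.
    """
    if abs(f_err) > threshold:
        return max(k_min, gamma * k)
    return min(k_max, k / gamma)
```

Learned variants replace this hand-tuned rule with a policy that outputs stiffness and damping directly, which is precisely the open problem the bullet above describes.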
7. Synthesis and Outlook
The force-guided imitation learning paradigm—anchored by bilateral control, hybrid action spaces, immersive demonstration, multimodal policy architectures, and compliant control—has established itself as foundational for real-world contact-rich robotic manipulation. Rigorous experimental results across a wide spectrum of manipulation settings have demonstrated that explicit force integration enables high success rates (up to +54.5% over vision-only), substantially improved robustness, and efficient transfer of human dexterity to robotic systems. Continued research in scalable, sensor-agnostic force inference, real-to-sim transfer, joint visuo-haptic/multimodal representation learning, and skill composition is expected to further broaden applicability and autonomy in robots operating in physically complex, unstructured environments (Adachi et al., 2018, Ge et al., 21 Sep 2025, Liu et al., 2024, Lee et al., 23 Sep 2025, Helmut et al., 15 Oct 2025, Chen et al., 17 Jan 2025, Stepputtis et al., 2022).