Contact SLAM: Tactile-Based Blind Manipulation
- Contact SLAM is a tactile-based SLAM framework using high-resolution contact sensing and geometric priors to precisely estimate gripper, object, and obstacle poses in blind tasks.
- The framework employs a factor-graph MAP estimator combined with particle-filtered tactile exploration to reduce pose uncertainty from ~30 mm to below 5 mm within 6–8 steps.
- It integrates active exploration and detailed sensor models to achieve millimeter-level localization for contact-rich, visionless manipulation despite assumptions of static environments.
Contact SLAM is a physically-driven simultaneous localization and mapping (SLAM) framework tailored for robots performing contact-rich manipulation where vision is unavailable or occluded (“blind manipulation”). It utilizes high-resolution tactile sensing in conjunction with known object geometries to estimate the state of both the manipulated objects and the surrounding environment. The framework integrates a factor-graph-based maximum a posteriori (MAP) estimator for tactile-only scene and pose inference with an active exploration policy that maximizes information gain, enabling precise and efficient manipulation in fine, contact-dominated blind tasks (Wang et al., 11 Dec 2025).
1. Mathematical Foundations
1.1 State Representation
At each time step $t$, Contact SLAM maintains estimates for the following state variables:
- $T_t^{g} \in SE(3)$: Gripper pose in the world frame.
- $T_t^{o} \in SE(3)$: Pose of the grasped object.
- $T_t^{e} \in SE(3)$: Pose of static environment features, such as obstacles or receptacles.
- $c_t \in \{0, 1\}$: Binary indicator of task completion.
Contact regions in the scene are abstracted as piecewise-linear polygonal boundaries. For each object or obstacle $k$:
$$\partial \mathcal{B}_k = \big\{(v_i^k, n_i^k)\big\}_{i=1}^{N_k},$$
where $v_i^k$ are vertex coordinates in the object frame and $n_i^k$ the corresponding outward normals.
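This polygonal-boundary representation can be sketched in code. The following is a minimal illustration (function and variable names are ours, not the paper's), pairing each edge of a counter-clockwise 2D polygon with its outward normal:

```python
import numpy as np

def polygon_boundary(vertices):
    """Build a piecewise-linear boundary from CCW 2D vertices.

    Returns (edge_midpoints, outward_normals), one entry per edge.
    """
    v = np.asarray(vertices, dtype=float)
    nxt = np.roll(v, -1, axis=0)           # successor vertex of each edge
    edges = nxt - v                        # edge direction vectors
    mids = 0.5 * (v + nxt)                 # edge midpoints
    # Rotate each CCW edge direction by -90 degrees to point outward.
    normals = np.stack([edges[:, 1], -edges[:, 0]], axis=1)
    normals /= np.linalg.norm(normals, axis=1, keepdims=True)
    return mids, normals

# Unit square, counter-clockwise: bottom edge normal points down (outward).
mids, normals = polygon_boundary([(0, 0), (1, 0), (1, 1), (0, 1)])
```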
1.2 Tactile Sensor Model
Each gripper finger is instrumented with a Tac3D sensor, which outputs the net contact force and torque. The measured signal at time $t$ is modeled as
$$z_t = h\big(T_t^{g}, T_t^{o}\big) + w_t, \qquad w_t \sim \mathcal{N}(0, \Sigma_z),$$
where $h(\cdot)$ is the analytic mapping from the object and gripper poses to predicted force/torque, and $\Sigma_z$ the sensor noise covariance. In the two-finger setting, the net wrench is decomposed into per-finger contact forces expressed in the object frame:
$$f = f_{\mathrm{left}} + f_{\mathrm{right}}.$$
The contact point $r$ is computed from the measured force $f$ and torque $\tau$:
$$\tau = [r]_\times f \quad \Longrightarrow \quad r = \frac{f \times \tau}{\|f\|^2} + \lambda f, \quad \lambda \in \mathbb{R},$$
where $[\,\cdot\,]_\times$ is the cross-product (skew-symmetric) matrix; the component of $r$ along $f$ is unobservable from the wrench alone.
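The contact-point recovery can be checked numerically. Below is a minimal sketch assuming the torque is measured about the sensor origin; it returns the minimum-norm solution (the component along $f$ is left at zero):

```python
import numpy as np

def contact_point(f, tau):
    """Recover the contact point r from tau = r x f.

    The component of r along f is unobservable from the wrench alone;
    this returns the minimum-norm solution (lambda = 0).
    """
    f = np.asarray(f, dtype=float)
    tau = np.asarray(tau, dtype=float)
    return np.cross(f, tau) / np.dot(f, f)

# A 5 N force along +z applied at (0.1, 0.2, 0.0) m produces
# tau = r x f = (1.0, -0.5, 0.0) N*m.
r = contact_point([0.0, 0.0, 5.0], [1.0, -0.5, 0.0])
```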
1.3 Transition Models
- Gripper motion follows measured robot kinematics: $T_{t+1}^{g} = T_t^{g} \oplus u_t$, where $u_t$ is the commanded motion, with pose prior noise $\Sigma_g$.
- The grasped object moves approximately rigidly with the gripper, modulo small slip: $T_{t+1}^{o} = T_{t+1}^{g}\,(T_t^{g})^{-1}\,T_t^{o}$ plus process noise.
- The environment is assumed static except when contact constraints are triggered: $T_{t+1}^{e} = T_t^{e}$.
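The rigid co-movement assumption for the grasped object can be written as a homogeneous-transform update. The sketch below uses plain 4x4 matrices and is illustrative, not the paper's implementation:

```python
import numpy as np

def propagate_object(T_g_new, T_g_old, T_o_old):
    """Rigid co-movement: the object keeps its pose relative to the gripper.

    T_o_new = T_g_new @ inv(T_g_old) @ T_o_old
    (slip would enter as additional process noise on top of this).
    """
    return T_g_new @ np.linalg.inv(T_g_old) @ T_o_old

def translation(t):
    """Homogeneous 4x4 transform for a pure translation t."""
    T = np.eye(4)
    T[:3, 3] = t
    return T

# The gripper translates by 10 mm in x; the grasped object follows rigidly.
T_g_old = translation([0.00, 0.00, 0.5])
T_g_new = translation([0.01, 0.00, 0.5])
T_o_old = translation([0.00, 0.05, 0.5])
T_o_new = propagate_object(T_g_new, T_g_old, T_o_old)
```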
1.4 Factor Graph MAP Inference
The joint inference is cast as a factor-graph-based MAP optimization:
$$\hat{X} = \arg\max_{X}\; p(X \mid Z) \;=\; \arg\min_{X} \sum_i \big\| r_i(X) \big\|^2_{\Sigma_i},$$
where $X$ stacks the gripper, object, and environment poses and each residual $r_i$ derives from one of the following factors:
- $\phi_g\big(T_t^{g}\big)$ (gripper prior),
- $\phi_o\big(T_t^{g}, T_t^{o}\big)$ (object pose constraint),
- $\phi_e\big(T_t^{o}, T_t^{e}\big)$ (environment via contact region boundaries),
- $\phi_c\big(T_t^{o}, T_t^{e}\big)$ (task completion/alignment).
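The structure of this objective can be illustrated with a linear toy problem: fusing a Gaussian prior on the gripper position with a single contact measurement along a known direction. This is only a sketch of the least-squares form; the paper's estimator operates on SE(3) factors (e.g., via GTSAM):

```python
import numpy as np

def map_estimate(mu_prior, Sigma_prior, H, z, Sigma_z):
    """MAP estimate for x with prior N(mu_prior, Sigma_prior) and a
    linear measurement z = H x + noise, noise ~ N(0, Sigma_z).

    Minimizes ||x - mu||^2_{Sigma_prior} + ||z - H x||^2_{Sigma_z}.
    """
    P_inv = np.linalg.inv(Sigma_prior)
    R_inv = np.linalg.inv(Sigma_z)
    A = P_inv + H.T @ R_inv @ H                # information matrix
    b = P_inv @ mu_prior + H.T @ R_inv @ z     # information vector
    return np.linalg.solve(A, b)

# 2D gripper position: vague prior at the origin, plus one contact event
# that measures the x-coordinate of a wall at x = 0.03 m very accurately.
mu = np.zeros(2)
Sigma_p = np.eye(2) * 1e-2                     # ~10 cm std prior
H = np.array([[1.0, 0.0]])                     # contact observes x only
z = np.array([0.03])
Sigma_z = np.array([[1e-6]])                   # ~1 mm std tactile
x_map = map_estimate(mu, Sigma_p, H, z, Sigma_z)
```

The precise tactile factor dominates along the observed direction, pulling the estimate to the wall, while the unobserved coordinate stays at its prior mean.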
2. Active Tactile Exploration Policy (ATEP)
ATEP is designed to reduce uncertainty in object-environment pose alignment before commencing fine manipulation. It operates as a particle-filtered, information-driven localizer:
2.1 Particle Representation
A set of particles $\{x^{(i)}\}_{i=1}^{N}$ captures pose hypotheses, each weighted as $w^{(i)}$ with $\sum_i w^{(i)} = 1$. Initialization is uniform.
2.2 Local-Peak Detection
Particles whose weight exceeds a threshold, $w^{(i)} > w_{\min}$, and that are locally dominant within a neighborhood radius $r_{\mathrm{peak}}$ are defined as "peaks". Convergence is declared if the spatial extent of these peaks falls within a tolerance $\epsilon$.
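One way to realize such local-peak detection is sketched below; the threshold values and the convergence test are illustrative placeholders, not the paper's:

```python
import numpy as np

def find_peaks(particles, weights, w_min=0.01, r_peak=0.005):
    """Return indices of 'peak' particles: weight above w_min and
    locally maximal within a radius r_peak (meters)."""
    particles = np.asarray(particles, dtype=float)
    weights = np.asarray(weights, dtype=float)
    peaks = []
    for i, (p, w) in enumerate(zip(particles, weights)):
        if w <= w_min:
            continue
        d = np.linalg.norm(particles - p, axis=1)
        neighbors = d < r_peak
        if w >= weights[neighbors].max():      # locally maximal weight
            peaks.append(i)
    return peaks

def converged(particles, peaks, eps=0.005):
    """Converged when all peaks lie within a box of half-width eps."""
    pts = np.asarray(particles, dtype=float)[peaks]
    return len(peaks) > 0 and np.ptp(pts, axis=0).max() <= 2 * eps

# Two hypotheses 1 mm apart plus one far-away outlier: two peaks, no
# convergence yet because the peaks are 10 cm apart.
particles = [[0.000, 0.0], [0.001, 0.0], [0.100, 0.0]]
weights = [0.5, 0.3, 0.2]
peaks = find_peaks(particles, weights)
```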
2.3 Information-Gain Criterion
For each candidate action $a$ and particle $x^{(i)}$, the predicted contact outcome and travel distance are computed. The score for action $a$ is
$$S(a) = H(p_a) - \lambda\, \sigma_d^2(a),$$
where $H(p_a)$ is the entropy of the predicted contact-type distribution across particles and $\sigma_d^2(a)$ is the travel-distance variance. The optimal action $a^* = \arg\max_a S(a)$ maximizes expected information gain while penalizing motion cost.
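This scoring can be sketched with contact types as categorical labels predicted per particle; the weight $\lambda$ and the labels below are illustrative choices:

```python
import numpy as np
from collections import Counter

def entropy(labels):
    """Shannon entropy (nats) of a list of categorical outcomes."""
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())

def action_score(contact_types, distances, lam=0.1):
    """Score = entropy of predicted contact types across particles
    minus lam * variance of the predicted travel distance."""
    return entropy(contact_types) - lam * float(np.var(distances))

# Action A splits the hypotheses between two contact outcomes (informative);
# action B predicts the same outcome for every particle (uninformative).
score_a = action_score(["edge", "edge", "face", "face"], [0.02, 0.02, 0.03, 0.03])
score_b = action_score(["face", "face", "face", "face"], [0.02, 0.02, 0.02, 0.02])
```

An action whose predicted outcome differs across hypotheses discriminates between them, so it scores higher than one that would feel the same no matter which hypothesis is true.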
2.4 Particle Update and Control
The selected action is executed until a tactile change is detected. The contact direction determines which region boundaries are updated. The particle set is restricted accordingly and weights are updated by the tactile likelihood:
$$w_{t+1}^{(i)} \;\propto\; w_t^{(i)}\, p\big(z_t \mid x^{(i)}\big).$$
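The restriction-and-reweighting step can be sketched as a standard Bayesian particle update, with likelihood 1 for hypotheses consistent with the observed contact and a small floor otherwise (the floor value is an illustrative choice):

```python
import numpy as np

def update_particles(particles, weights, consistent, floor=1e-3):
    """Bayesian reweighting after a tactile event.

    consistent: boolean mask, True where a particle's predicted contact
    matches the observed one. Inconsistent particles are down-weighted
    to a small floor rather than deleted outright.
    """
    weights = np.asarray(weights, dtype=float).copy()
    likelihood = np.where(consistent, 1.0, floor)
    weights *= likelihood                  # w <- w * p(z | x)
    weights /= weights.sum()               # renormalize
    return particles, weights

particles = np.array([[0.00, 0.0], [0.01, 0.0], [0.10, 0.0]])
weights = np.full(3, 1.0 / 3.0)
# The observed contact is consistent with the first two hypotheses only.
particles, weights = update_particles(
    particles, weights, np.array([True, True, False])
)
```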
2.5 Termination
The process terminates when the peak count falls to a small number $K$ and the spatial spread drops below $\epsilon$, at which point a final alignment trajectory is executed to drive the task-completion indicator to $c_t = 1$.
3. Implementation and Experimental Evaluation
3.1 Robotic Setup
- 6-DOF robotic arm equipped with a two-finger parallel gripper.
- Tac3D tactile sensors on each fingertip, providing 3D force distribution and torque with millimeter-level contact-point localization error.
- Manipulated objects modeled from CAD meshes.
3.2 Demonstrated Tasks
- Blind Socket Assembly: Insertion of two- and three-pin plugs into matched sockets; no vision is used.
- Blind Block-Pushing: A T-shaped tool pushes a movable block past obstacles, relying solely on geometric priors and tactile cues.
3.3 Metric Results
Table: Summary of Empirical Results
| Task | Final Pose Error | Mean Exploration Steps | Localization Error |
|---|---|---|---|
| Socket (two-pin) | 3.775 mm | 7.13 | mm-level (contact point) |
| Socket (three-pin) | 1.815 mm | 7.67 | mm-level (contact point) |
| Block pushing (obstacles) | n/a | 3–5 tactile events | mm-level (obstacle) |
After $6$–$8$ exploration steps, the particle spread (pose uncertainty) was reduced from approximately $30$ mm to below $5$ mm.
3.4 Sensitivity and Ablation
- Task complexity (e.g., plug geometry) affects the number of required exploration steps.
- The peak-detection thresholds and the entropy weight $\lambda$ influence the speed and reliability of localization.
4. Analysis: Contributions, Strengths, and Limitations
4.1 Main Contributions
- A tactile-based SLAM framework leveraging prior object geometry and physical reasoning, achieving millimeter-level localization accuracy.
- Modular factor-graph estimation that supports plug-and-play integration with existing solvers (e.g., GTSAM).
- An exploration heuristic balancing entropy (informativeness) and trajectory length for efficient contact localization.
4.2 Advantages
- Operation is fully visionless, enabling manipulation in visually occluded scenarios.
- Generalizable across multiple task classes, including peg-in-hole and pushing.
- Provides real-time localization and force-based inference without reliance on external sensors.
4.3 Limitations
- Assumes quasi-static, rigid, geometric contact models; does not account for fast dynamics, compliance, or deformable components.
- The environment is static; moving obstacles are not treated.
- Particle filter resampling can lead to several millimeters of pose uncertainty, particularly in ambiguous or flat-likelihood regimes.
5. Perspectives and Future Research Directions
Planned research directions include integrating dynamic and force–mass models for rapid manipulation, extending methods to deformable and articulated objects, and the fusion of sparse vision measurements to augment tactile-only SLAM for hybrid active perception scenarios (Wang et al., 11 Dec 2025). This suggests potential applicability in unstructured or partially observable environments common to advanced manufacturing, service robotics, and field deployment, contingent on overcoming limitations related to environment assumptions and contact modeling.