Tac2Real: Reliable and GPU Visuotactile Simulation for Online Reinforcement Learning and Zero-Shot Real-World Deployment

Published 30 Mar 2026 in cs.RO | (2603.28475v1)

Abstract: Visuotactile sensors are indispensable for contact-rich robotic manipulation tasks. However, policy learning with tactile feedback in simulation, especially for online reinforcement learning (RL), remains a critical challenge, as it demands a delicate balance between physics fidelity and computational efficiency. To address this challenge, we present Tac2Real, a lightweight visuotactile simulation framework designed to enable efficient online RL training. Tac2Real integrates the Preconditioned Nonlinear Conjugate Gradient Incremental Potential Contact (PNCG-IPC) method with a multi-node, multi-GPU high-throughput parallel simulation architecture, which can generate marker displacement fields at interactive rates. Meanwhile, we propose a systematic approach, TacAlign, to narrow both structured and stochastic sources of domain gap, ensuring a reliable zero-shot sim-to-real transfer. We further evaluate Tac2Real on the contact-rich peg insertion task. The zero-shot transfer results achieve a high success rate in the real-world scenario, verifying the effectiveness and robustness of our framework. The project page is: https://ningyurichard.github.io/tac2real-project-page/


Authors (7)

Summary

The paper introduces a PNCG-IPC solver that enhances visuotactile simulation by leveraging GPU-parallelizable operations for efficient RL rollouts.
TacAlign systematically reduces sim-to-real gaps through a four-stage calibration, enabling robust zero-shot policy transfer for contact-rich tasks.
The framework achieves high scalability with 4,465 FPS across 4,096 environments and a 91.7% real-world success rate in peg insertion tasks.

Tac2Real: High-Fidelity Visuotactile Simulation for Online RL and Zero-Shot Real-World Deployment

Introduction and Motivation

Tac2Real addresses the intrinsic challenges associated with simulating vision-based tactile sensors for contact-rich robotic manipulation, particularly focusing on the demands of online reinforcement learning (RL). Traditional penalty-based simulators fail to capture realistic soft elastomer deformation and multi-phase contact dynamics, while physics-based approaches such as material point methods (MPM) suffer from numerical instability under large deformations. Moreover, most simulators do not scale with multi-GPU architectures, severely limiting their throughput and suitability for RL policy training at scale.

Tac2Real integrates the Preconditioned Nonlinear Conjugate Gradient Incremental Potential Contact (PNCG-IPC) method into a highly parallel multi-node, multi-GPU simulation architecture, permitting large-scale RL training with high-fidelity visuotactile signals. The framework includes the TacAlign calibration procedure to systematically reduce both structured and stochastic sim-to-real gaps, enabling robust zero-shot policy transfer.

Figure 1: Tac2Real system architecture integrates parallel tactile simulation and reality gap reduction for reliable policy training and deployment.

Simulation Methodology and Implementation

PNCG-IPC Solver

Tac2Real’s tactile simulation is built around the PNCG-IPC solver, which combines a log-barrier contact model with nonlinear conjugate gradient optimization. The solver avoids Newton's method and its expensive Hessian factorization, relying instead on GPU-parallelizable gradient and diagonal-Hessian computations. An analytic upper bound on the step size removes the need for continuous collision detection (CCD), which substantially increases throughput. The solver trades per-iteration precision for overall convergence speed in a way suited to RL-scale rollouts, producing physically consistent deformation without excessive computational overhead.
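
To make the solver structure concrete, the following is a minimal NumPy sketch of a Jacobi-preconditioned nonlinear conjugate gradient loop on a toy problem: a 1D chain of nodes pulled toward targets that partly lie below a floor at height 0, with an IPC-style log barrier keeping every node above the floor. It is not the paper's implementation. The toy energy, the constants, the Polak-Ribière-type update, the Armijo safeguard, and all function names are assumptions made for illustration; the exact formulas used by PNCG-IPC are not given in this summary.

```python
import numpy as np

# Hypothetical toy constants (not taken from the paper).
K_SPRING, K_TARGET, KAPPA, D_HAT = 10.0, 1.0, 1.0, 0.1

def barrier_terms(d):
    """IPC-style barrier b(d) = -(d - D_HAT)^2 ln(d / D_HAT) on 0 < d < D_HAT,
    with its first and second derivatives (all zero for d >= D_HAT)."""
    b, db, ddb = np.zeros_like(d), np.zeros_like(d), np.zeros_like(d)
    a = d < D_HAT
    da, ln = d[a], np.log(d[a] / D_HAT)
    r = da - D_HAT
    b[a] = -r**2 * ln
    db[a] = -2.0 * r * ln - r**2 / da
    ddb[a] = -2.0 * ln - 4.0 * r / da + r**2 / da**2
    return b, db, ddb

def energy(x, target):
    b, _, _ = barrier_terms(x)
    return (0.5 * K_SPRING * np.sum(np.diff(x) ** 2)
            + 0.5 * K_TARGET * np.sum((x - target) ** 2) + KAPPA * b.sum())

def grad_and_diag_hess(x, target):
    """Gradient and Hessian diagonal; both are simple per-node operations."""
    _, db, ddb = barrier_terms(x)
    diff = np.diff(x)
    g = K_TARGET * (x - target) + KAPPA * db
    g[:-1] -= K_SPRING * diff
    g[1:] += K_SPRING * diff
    deg = np.full_like(x, 2.0)
    deg[0] = deg[-1] = 1.0              # end nodes touch only one spring
    return g, K_SPRING * deg + K_TARGET + KAPPA * ddb

def hess_vec(x, p):
    """Matrix-free Hessian-vector product (no factorization needed)."""
    _, _, ddb = barrier_terms(x)
    dp = np.diff(p)
    hp = (K_TARGET + KAPPA * ddb) * p
    hp[:-1] -= K_SPRING * dp
    hp[1:] += K_SPRING * dp
    return hp

def pncg_solve(x0, target, tol=1e-6, max_iter=500):
    x = x0.copy()
    g, h = grad_and_diag_hess(x, target)
    p = -g / h                           # Jacobi-preconditioned steepest descent
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        gp = g @ p
        alpha = -gp / (p @ hess_vec(x, p))          # 1D Newton step along p
        shrinking = p < 0
        if shrinking.any():
            # Closed-form cap that keeps every node above the floor; this
            # plays the role of a step-size bound in place of CCD.
            alpha = min(alpha, 0.9 * np.min(x[shrinking] / -p[shrinking]))
        e0 = energy(x, target)
        for _ in range(50):                         # Armijo safeguard (assumption)
            if energy(x + alpha * p, target) <= e0 + 1e-4 * alpha * gp:
                break
            alpha *= 0.5
        else:
            alpha = 0.0
        x = x + alpha * p
        g_new, h_new = grad_and_diag_hess(x, target)
        # Preconditioned Polak-Ribiere+ coefficient (h is still the old diagonal).
        beta = max(0.0, g_new @ ((g_new - g) / h_new) / (g @ (g / h)))
        p = -g_new / h_new + beta * p
        if g_new @ p >= 0:                          # restart if not a descent direction
            p = -g_new / h_new
        g, h = g_new, h_new
    return x

target = np.linspace(-0.5, 0.3, 8)      # some targets lie below the floor
x0 = np.full(8, 0.5)
x_star = pncg_solve(x0, target)
```

Every operation in the loop is a vector update, a dot product, or a per-node evaluation, which is the property that makes this family of solvers a natural fit for GPU batching across many environments.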

Tactile Representation and Sensory Modeling

Tac2Real directly outputs marker displacement fields rather than RGB tactile images, based on empirical evidence that displacement fields are more sensitive and distinguish contact modes more clearly in robot manipulation tasks. Each simulated marker is associated with its k nearest gel mesh nodes, and its displacement is interpolated from the deformed state of those nodes, which keeps the observation both faithful and low-dimensional for policy learning.
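
The marker-to-mesh mapping can be sketched as follows. This is an illustrative NumPy version that assumes inverse-distance weighting over the k nearest rest-state nodes; the actual weighting scheme, the value of k, and the function name `knn_marker_displacements` are assumptions, not details from the paper.

```python
import numpy as np

def knn_marker_displacements(markers, rest_nodes, deformed_nodes, k=4, eps=1e-9):
    """Interpolate marker displacements from the k nearest gel mesh nodes.

    markers:        (M, D) marker positions in the rest configuration
    rest_nodes:     (N, D) mesh node positions in the rest configuration
    deformed_nodes: (N, D) mesh node positions after deformation
    Returns an (M, D) array of marker displacements.
    """
    # Pairwise marker-to-node distances in the rest configuration, shape (M, N).
    dist = np.linalg.norm(markers[:, None, :] - rest_nodes[None, :, :], axis=-1)
    idx = np.argsort(dist, axis=1)[:, :k]               # (M, k) nearest node ids
    near = np.take_along_axis(dist, idx, axis=1)        # (M, k) their distances
    w = 1.0 / (near + eps)                              # inverse-distance weights
    w /= w.sum(axis=1, keepdims=True)                   # each row sums to 1
    node_disp = deformed_nodes - rest_nodes             # (N, D)
    return np.einsum("mk,mkd->md", w, node_disp[idx])   # weighted average

# Usage: a rigid shift of the whole mesh moves every marker by the same amount.
rng = np.random.default_rng(0)
nodes = rng.uniform(0.0, 1.0, size=(50, 2))
marks = rng.uniform(0.2, 0.8, size=(9, 2))
shift = np.array([0.05, -0.02])
disp = knn_marker_displacements(marks, nodes, nodes + shift)
```

Because the neighbor indices and weights depend only on the rest configuration, they could be precomputed once and reused at every simulation step.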

Figure 2: Visual comparison of tactile sensor output in various contact modes, highlighting marker displacement field sensitivity.

Multi-Node, Multi-GPU Integration

The tactile simulation backend operates as a plugin to mainstream physics engines (Isaac Lab, MuJoCo). Simulation environments are distributed across Ray-managed clusters, with each GPU hosting multiple tactile tasks. At every step, the plugin extracts the relevant robot, sensor, and object states, runs the PNCG-IPC solver, and returns marker displacements as additional observations for the RL agent.
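
A rough sketch of how environments might be partitioned across workers is shown below. It uses plain Python and deliberately makes no Ray calls; the function name `shard_environments`, the (node, GPU) keying, and the contiguous block layout are all assumptions for illustration rather than the framework's actual scheduling logic.

```python
def shard_environments(num_envs, num_nodes, gpus_per_node):
    """Split environment indices into contiguous blocks, one per (node, gpu).

    Returns a dict mapping (node_id, gpu_id) to a range of environment ids.
    Block sizes differ by at most one environment.
    """
    workers = [(n, g) for n in range(num_nodes) for g in range(gpus_per_node)]
    base, extra = divmod(num_envs, len(workers))
    shards, start = {}, 0
    for i, worker in enumerate(workers):
        size = base + (1 if i < extra else 0)   # first `extra` workers get one more
        shards[worker] = range(start, start + size)
        start += size
    return shards

# Usage: 10 environments over 2 nodes with 2 GPUs each.
shards = shard_environments(10, 2, 2)
for (node, gpu), env_ids in shards.items():
    print(f"node {node} gpu {gpu}: envs {list(env_ids)}")
```

In a real deployment each shard would be handed to a GPU-resident worker that runs the tactile solver for its block of environments and returns the marker displacements to the RL process.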

Figure 3: Tac2Real simulation parallelization, assigning tactile computation tasks across GPU clusters for scalable RL rollouts.

TacAlign: Systematic Sim-to-Real Gap Reduction

TacAlign is a four-stage calibration protocol: (i) trajectory-level controller gain alignment, (ii) baseline IPC material parameter calibration via marker MSE minimization, (iii) task-based fine-tuning for friction/contact stiffness, and (iv) stochastic domain randomization. Unlike naive parameter matching, TacAlign leverages alternating minimization and empirical trajectory discrepancy to synchronize simulation and real-world dynamics.
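
Two of the stages lend themselves to a short illustration: the marker MSE objective from stage (ii) and the stochastic randomization from stage (iv). The sketch below is an assumption-laden example; the parameter names, the uniform multiplicative noise model, and the 10% range are hypothetical and are not taken from the paper.

```python
import numpy as np

def marker_mse(sim_traj, real_traj):
    """Mean squared error between two marker displacement trajectories.

    Both inputs have shape (T, M, 2): T time steps, M markers, 2D displacement.
    """
    sim_traj, real_traj = np.asarray(sim_traj), np.asarray(real_traj)
    if sim_traj.shape != real_traj.shape:
        raise ValueError("trajectories must have identical shapes")
    return float(np.mean((sim_traj - real_traj) ** 2))

def randomize_params(nominal, rel_range, rng):
    """Scale each calibrated parameter by a factor drawn uniformly from
    [1 - rel_range, 1 + rel_range] (hypothetical noise model)."""
    return {name: value * rng.uniform(1.0 - rel_range, 1.0 + rel_range)
            for name, value in nominal.items()}

# Usage with made-up values; the parameter names are placeholders.
rng = np.random.default_rng(0)
nominal = {"youngs_modulus": 1.0e5, "friction": 0.6, "contact_stiffness": 1.0e3}
episode_params = randomize_params(nominal, rel_range=0.1, rng=rng)

real = rng.normal(size=(5, 9, 2))
loss_perfect = marker_mse(real, real)        # identical trajectories
loss_offset = marker_mse(real + 0.1, real)   # constant 0.1 offset on every entry
```

The calibrated values from the structured stages serve as the nominal point, and randomization around that point is what accounts for the residual, unmodeled variation.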

Figure 4: TacAlign framework schematic, detailing structured and stochastic approaches for sim-to-real gap narrowing.

Quantitative Simulation and Calibration Results

Tac2Real shows notable improvements in simulation fidelity and stability over Tacchi and TacSL. The PNCG-IPC solver handles large rotational and slip deformations stably, avoiding numerical instability and recovering accurately after contact. Parallel performance benchmarks demonstrate high scalability: Tac2Real achieves 4,465 FPS for 4,096 environments across 16 RTX 4090 GPUs, outperforming Tacchi (MPM-based) and maintaining superior physical accuracy compared to TacSL’s penalty-based approach.
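
To put the throughput figure in perspective, the short calculation below breaks it down per GPU and per environment. It assumes that the reported FPS is an aggregate count of tactile frames per second summed over all environments and that environments are spread evenly across GPUs; this summary states neither explicitly.

```python
total_fps = 4465      # reported aggregate frames per second
num_envs = 4096       # reported number of parallel environments
num_gpus = 16         # reported number of RTX 4090 GPUs

envs_per_gpu = num_envs // num_gpus          # 256 environments per GPU
fps_per_gpu = total_fps / num_gpus           # about 279 frames/s per GPU
fps_per_env = total_fps / num_envs           # about 1.09 frames/s per environment

print(envs_per_gpu, round(fps_per_gpu, 1), round(fps_per_env, 2))
```

Under these assumptions, each environment receives roughly one tactile update per second of wall-clock time, so the practical benefit comes from running thousands of environments at once rather than from per-environment speed.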

Figure 5: Comparative evaluation showing Tac2Real's accuracy, stability, and parallel efficiency relative to baseline simulators.

Simulation parameter calibration using multiple indenter types and deformation modes reduces MSE between simulated and real marker trajectories. Task-specific friction/contact adjustments further align the tactile signal distribution during critical manipulative events.

Figure 6: Results of simulation parameter calibration, demonstrating close correspondence between simulated and physical elastomer responses.

RL Training and Zero-Shot Deployment Performance

Tac2Real-trained policies are evaluated on peg insertion and nut threading tasks with randomized initial orientations, relying exclusively on tactile displacement fields and end-effector pose for inference. In simulation, Tac2Real and TacSL achieve similar high success rates (≥0.77 for peg insertion), far surpassing Tacchi and non-tactile baselines.

Figure 7: Contact-rich task snapshots and learning curve comparison for Tac2Real against competing tactile simulation backends.

Zero-shot deployment on a real Franka Panda platform yields a success rate of 91.7% for peg insertion, significantly exceeding TacSL (15%) and non-tactile (6.7%) policies. Ablation studies confirm the indispensability of TacAlign’s task-based calibration and randomization for robust transfer. Tacchi-based policies are hampered by numerical artifacts in tactile feedback.

Figure 8: Real-world peg insertion deployment snapshots validating zero-shot transfer performance.

Supplementary Simulation Analyses

Extensive supplementary results for PNCG-IPC include deformation responses across multiple indenter geometries and deformation modes (press, slide, rotate), together with detailed comparisons against the MPM-based Tacchi under challenging contact-loss conditions during upward and sliding motions. Tac2Real maintains recovery and fidelity, supporting its suitability for RL-driven contact-rich manipulation.

Figure 9: Visual validation of Tac2Real’s marker displacement fidelity across deformation modes and indenter shapes.

Figure 10: Comparison of Tac2Real and Tacchi under upward and slide deformations, illustrating Tac2Real's numerical stability.

Parameter calibration via CMA-ES shows consistent loss reduction and rapid convergence, ensuring reproducibility and alignment with real-world sensor behavior.
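
The black-box calibration loop can be illustrated with a much simpler evolution strategy. The sketch below is not CMA-ES: it samples from an isotropic Gaussian, moves the mean to the average of the best candidates, and shrinks the step size geometrically, with no covariance adaptation. The toy simulator, the "real" data, and every name here are hypothetical stand-ins for the tactile simulator and recorded marker trajectories.

```python
import numpy as np

def toy_sim(theta, depths):
    """Hypothetical stand-in simulator: marker shift as a function of press depth."""
    return theta[0] * depths + theta[1] * depths ** 2

def calibration_loss(theta, depths, real):
    """MSE between simulated and 'real' marker shifts."""
    return float(np.mean((toy_sim(theta, depths) - real) ** 2))

def simple_es(loss, x0, sigma0=0.5, pop=16, elite=4, iters=60, seed=0):
    """Simplified elite-mean evolution strategy (not CMA-ES)."""
    rng = np.random.default_rng(seed)
    mean, sigma = np.asarray(x0, dtype=float), sigma0
    best_x, best_l = mean.copy(), loss(mean)
    for _ in range(iters):
        cand = mean + sigma * rng.standard_normal((pop, mean.size))
        losses = np.array([loss(c) for c in cand])
        order = np.argsort(losses)
        mean = cand[order[:elite]].mean(axis=0)     # recombine the elite
        if losses[order[0]] < best_l:               # track the best sample seen
            best_x, best_l = cand[order[0]].copy(), float(losses[order[0]])
        sigma *= 0.9                                # fixed geometric step decay
    return best_x, best_l

depths = np.linspace(0.0, 1.0, 20)
real = toy_sim(np.array([1.5, -0.3]), depths)       # synthetic "measurements"
loss = lambda theta: calibration_loss(theta, depths, real)
x0 = np.array([1.0, 0.0])
best_theta, best_loss = simple_es(loss, x0)
```

A full CMA-ES implementation additionally adapts the sampling covariance and step size from the search history, which matters when parameters are strongly correlated or differ widely in scale.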

Figure 11: CMA-ES optimization loss curve for simulation parameter calibration.

Additional deployment snapshots reinforce the generalization of Tac2Real-trained policies across diverse peg orientations under blind conditions, that is, using only tactile displacement fields and end-effector pose.

Figure 12: Supplementary real-world deployment results demonstrating policy adaptability under varying peg orientations.

Implications and Future Directions

Tac2Real substantially advances scalable visuotactile simulation for RL-based robotic manipulation, establishing a reliable pipeline for zero-shot policy deployment. By combining physically accurate elastomer modeling, multi-node GPU throughput, and systematic gap calibration, Tac2Real enables practical tactile-driven manipulation across diverse contact-rich tasks.

The implications extend to large-scale tactile policy training, dexterous-hand RL for manipulating soft or deformable objects, and the creation of datasets for training vision-language-action (VLA) foundation models in tactile domains. The methodology may generalize beyond tactile sensing to other sensor modalities, including acoustic and olfactory, with potential for AI-based simulation surrogates to improve efficiency.

Conclusion

Tac2Real presents a robust visuotactile simulation and calibration framework enabling reliable online RL and zero-shot real-world deployment for contact-rich tasks (2603.28475). Strong empirical performance and systematic sim-to-real alignment position Tac2Real as a scalable foundation for future tactile-driven manipulation research and broader sensor simulation integration.
