PlasticineLab: Soft-Body Physics Benchmark
- PlasticineLab is a differentiable physics benchmark that models deformable objects with an elastoplastic MPM, supporting both reinforcement learning and gradient-based optimization.
- It offers a diverse suite of 3D manipulation tasks with evaluation metrics such as incremental soft IoU and Wasserstein distance to assess manipulation performance.
- Its CPDeform extension uses optimal transport for contact-point discovery, effectively addressing multi-stage manipulation challenges and local minima issues.
PlasticineLab is a differentiable physics benchmark and simulation environment specifically designed to advance research in soft-body robotic manipulation. It provides a suite of 3D elastoplastic manipulation tasks and exposes a fully differentiable Material Point Method (MPM) simulator, enabling both reinforcement learning and gradient-based trajectory optimization approaches to be directly compared in the context of physically realistic, deformable object manipulation (Huang et al., 2021, Li et al., 2022, Chen et al., 2021). The benchmark’s design, methodology, and extensibility have positioned it as a central resource for investigations into model-based control, dynamic state representation, and algorithmic advances in soft-matter robotics.
1. Architecture and Differentiable Physics
The PlasticineLab environment models deformable objects (e.g., blocks, ropes, shells) as elastoplastic continua using the Moving Least Squares Material Point Method (MLS-MPM). The physics engine is implemented in Taichi and supports reverse-mode automatic differentiation across all simulation components, including elastic and plastic deformations, collisions, and frictional contact (Huang et al., 2021). This fully differentiable pipeline enables direct computation of analytic gradients with respect to action sequences, critical for trajectory optimization and representation learning.
The simulator’s capabilities are defined by the following features:
- Elastoplastic continuum modeling: Uses a Neo-Hookean elastic energy density, with plasticity resolved via a von Mises yield criterion applied per particle through an SVD-based return mapping (see the sketch after this list).
- Contact and friction: Rigid bodies are represented as signed-distance fields (SDF); contact strengths are smoothed for differentiability.
- Time stepping: Implicit Euler integration for stable simulation of highly deformable materials.
- Autodiff support: All particle-grid operations (P2G/G2P), grid solves, plastic flow, and contact are differentiable, exposing gradients for control and learning.
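As an illustration of the per-particle plasticity step, the following is a minimal NumPy sketch of an SVD-based von Mises return mapping on the Hencky strain; the material constants and exact normalization are illustrative assumptions and are not taken from the PlasticineLab source.

```python
import numpy as np

def von_mises_return_map(F, mu=1.0, yield_stress=0.1):
    """Project an elastic deformation gradient F back onto the von Mises
    yield surface via an SVD-based return map (generic textbook form;
    the constants used by PlasticineLab may differ)."""
    U, sig, Vt = np.linalg.svd(F)
    eps = np.log(sig)                       # Hencky (logarithmic) strain
    eps_dev = eps - eps.mean()              # deviatoric part
    dev_norm = np.linalg.norm(eps_dev)
    delta_gamma = dev_norm - yield_stress / (2.0 * mu)
    if delta_gamma <= 0.0 or dev_norm < 1e-12:
        return F                            # inside the yield surface: purely elastic
    eps_proj = eps - (delta_gamma / dev_norm) * eps_dev  # pull back to the surface
    return U @ np.diag(np.exp(eps_proj)) @ Vt
```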
Gradient-based control is realized by unrolling the simulator through time, $s_{t+1} = f(s_t, a_t)$, and backpropagating a task loss $L$ through the rollout to obtain gradients $\partial L / \partial a_t$ with respect to the control variables $a_{1:T}$. This approach enables optimization of open-loop control sequences to match target deformations.
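The toy Taichi sketch below mirrors this recipe on a one-dimensional double integrator rather than the actual MPM simulator: the rollout is recorded with `ti.ad.Tape`, and the resulting action gradients drive plain gradient descent on an open-loop action sequence (Taichi ≥ 1.0 assumed; none of these names come from the PlasticineLab API).

```python
import taichi as ti

ti.init(arch=ti.cpu)

T, dt, target = 64, 0.02, 1.0
pos = ti.field(ti.f32, shape=T + 1, needs_grad=True)
vel = ti.field(ti.f32, shape=T + 1, needs_grad=True)
act = ti.field(ti.f32, shape=T, needs_grad=True)        # open-loop controls
loss = ti.field(ti.f32, shape=(), needs_grad=True)

@ti.kernel
def step(t: ti.i32):
    # Toy dynamics standing in for one differentiable simulator step.
    vel[t + 1] = vel[t] + dt * act[t]
    pos[t + 1] = pos[t] + dt * vel[t + 1]

@ti.kernel
def compute_loss():
    loss[None] = (pos[T] - target) ** 2                  # match the target "deformation"

for it in range(50):
    with ti.ad.Tape(loss=loss):                          # record rollout, backprop on exit
        for t in range(T):
            step(t)
        compute_loss()
    for t in range(T):                                   # gradient descent on the actions
        act[t] -= 10.0 * act.grad[t]
```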
2. Manipulation Task Suite and Benchmark Protocol
PlasticineLab’s benchmark suite consists of ten canonical 3D soft-body manipulation tasks, spanning pinching, rolling, carving, and multi-stage assembly. Each task is parameterized by randomized scene variants, manipulators (spheres, spatulas, capsules), and goal configurations (Huang et al., 2021). Notable tasks include:
- Move/TripleMove: Block translation via grasping.
- Rope: Reshaping a flexible rope around an obstacle.
- Torus: Molding a torus shape within a deformable block.
- Writer: Carving or drawing specified curves on a plasticine surface.
- Pinch: Creating indentations at specified locations.
A key aspect is task diversity, with each scenario varying in control dimensionality, object geometry, and manipulation challenges (e.g., force closure, contact planning). Rewards integrate shape similarity (incremental soft IoU on voxelized densities), signed-distance field alignment, and manipulation regularizers.
Evaluation Metrics
Evaluation is based on:
- Incremental soft IoU: a soft IoU between the voxelized mass density of the simulated state and that of the goal, normalized against the initial configuration, $\mathrm{score} = \frac{\mathrm{IoU}(S_T, S_{\mathrm{goal}}) - \mathrm{IoU}(S_0, S_{\mathrm{goal}})}{1 - \mathrm{IoU}(S_0, S_{\mathrm{goal}})}$, where $S_T$ is the simulated mass density at the final time step, $S_0$ the initial density, and $S_{\mathrm{goal}}$ the goal density (a minimal soft-IoU sketch follows this list).
- In multi-stage settings, Wasserstein-1 distances between final and target particle sets are used (Li et al., 2022).
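As a minimal sketch of the shape-similarity term, the soft IoU between two voxelized density grids can be computed as below; the exact clipping and normalization constants used by the benchmark may differ.

```python
import numpy as np

def soft_iou(density_a, density_b):
    """Soft IoU between two voxel density grids, each normalized to [0, 1]."""
    a = np.clip(density_a, 0.0, 1.0)
    b = np.clip(density_b, 0.0, 1.0)
    intersection = (a * b).sum()
    union = (a + b - a * b).sum()
    return intersection / (union + 1e-12)

def incremental_iou(density_final, density_init, density_goal):
    """Incremental score: improvement over the initial configuration."""
    iou_0 = soft_iou(density_init, density_goal)
    iou_t = soft_iou(density_final, density_goal)
    return (iou_t - iou_0) / (1.0 - iou_0 + 1e-12)
```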
3. Control Approaches and Algorithmic Baselines
PlasticineLab enables the comparative study of different control algorithms under a consistent, physically grounded environment:
- Reinforcement Learning (RL) Baselines: Model-free methods (SAC, PPO, TD3) are applied, using fixed observation and reward setups. RL policies generally struggle on tasks requiring long-horizon planning or precision contact establishment due to exploration bottlenecks and local minima (Huang et al., 2021).
- Gradient-Based Trajectory Optimization: Analytic gradients from the differentiable simulator are exploited to optimize open-loop action sequences via Adam or SGD. This family achieves rapid convergence and state-of-the-art results on most single-stage tasks, provided suitable contact and action initialization (Huang et al., 2021).
- Hybrid Methods: The integration of learned representations or RL policy gradients with analytic physics gradients is identified as an open research direction.
A central limitation uncovered is that vanilla gradient-based solvers encounter severe local minima in multi-stage manipulation, particularly when initial or intermediate contact points are not carefully set. These tasks require the manipulator to switch contacts in sequence, a scenario in which gradient information alone cannot "pull" the end-effector away from its current contact region.
4. Extensions: PlasticineLab-M and Contact-Point Discovery (CPDeform)
To overcome the limitations in multi-stage manipulation, PlasticineLab was recently extended to PlasticineLab-M, which introduces seven new multi-stage manipulation environments, including Airplane, Chair, Bottle, Star, Move++, Rope++, and Writer++ (Li et al., 2022). Each task mandates at least one explicit switch in contact region to achieve staged sub-goals.
The CPDeform method augments the traditional differentiable physics pipeline with contact-point discovery using Optimal Transport (OT). At each stage, OT dual potentials between the current and target particle sets identify "high-priority" regions requiring deformation. Manipulator placement is then heuristically optimized toward these regions (using grid search for single manipulators or a discrete set of poses for multi-manipulator setups), and a local trajectory is optimized using the differentiable simulator for each contact plan.
Algorithmic steps for multi-stage CPDeform (a simplified sketch of the OT step follows this list):
- Compute OT dual potentials between the current and goal particle sets.
- Select the particle with the maximal dual potential (highest transport priority).
- Place effector(s) near the selected particle, choosing the pose that maximizes a placement score weighting each particle's transport priority by its proximity to the manipulator, where proximity is measured via the signed distance from the particle to the manipulator surface.
- Optimize actions using gradient descent over the differentiable physics rollout.
- Iterate the procedure for each stage, updating the current shape as input to the next OT computation.
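The following NumPy sketch approximates the contact-discovery step with entropic (Sinkhorn) OT between uniform point clouds: the source-side dual potential serves as the transport priority, and its argmax gives a candidate contact region. CPDeform's exact formulation and placement score may differ; all names here are illustrative.

```python
import numpy as np

def sinkhorn_dual_potentials(X, Y, reg=0.1, n_iter=200):
    """Entropic OT between uniform point clouds X (current) and Y (goal);
    returns a dual potential per source particle (higher = harder to transport)."""
    n, m = len(X), len(Y)
    C = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)   # pairwise cost
    K = np.exp(-C / reg)
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iter):                                      # Sinkhorn iterations
        u = a / (K @ v + 1e-30)
        v = b / (K.T @ u + 1e-30)
    return reg * np.log(u + 1e-30)                               # source dual potential

X_cur = np.random.rand(512, 3)    # current particle positions (toy data)
Y_goal = np.random.rand(512, 3)   # goal particle positions (toy data)
priority = sinkhorn_dual_potentials(X_cur, Y_goal)
contact_candidate = X_cur[np.argmax(priority)]   # place the effector near this particle
```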
Empirical results demonstrate that CPDeform consistently discovers effective initial and intermediate contacts, largely avoiding local minima and outperforming both vanilla differentiable physics and RL on the new multi-stage tasks (final Wasserstein-1 distances of 0.005–0.02 for CPDeform vs. 0.02–0.19 for baselines) (Li et al., 2022).
5. Dynamic State Representation via Differentiable Simulation
PlasticineLab has facilitated advances in latent state representation for deformable object manipulation through tightly coupled differentiable physics-based pipelines (Chen et al., 2021). DiffSRL is a state representation learning approach that employs a point cloud encoder (Point Completion Network variant) and a folding-based decoder, followed by a constraint projector and the differentiable MPM simulator to enforce multi-step consistency.
The pipeline is trained end-to-end with composite losses (a toy sketch follows this list):
- Multi-step reconstruction via Earth Mover’s Distance (EMD) across trajectory rollouts.
- Constraint loss ensuring decoded states satisfy non-interpenetration and mechanical feasibility.
- Optional dynamics consistency losses in the latent space.
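A toy PyTorch sketch of the multi-step reconstruction term is shown below, using a symmetric Chamfer distance as a cheap stand-in for EMD and a simple point-cloud autoencoder in place of DiffSRL's PCN-style encoder, folding decoder, constraint projector, and simulator rollout; it only illustrates the shape of the loss, not the published architecture.

```python
import torch
import torch.nn as nn

def chamfer(p, q):
    """Symmetric Chamfer distance between point clouds of shape (N, 3),
    used here as a cheap stand-in for the Earth Mover's Distance."""
    d = torch.cdist(p, q)
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

class PointAutoencoder(nn.Module):
    """Minimal per-point encoder with max pooling and an MLP decoder."""
    def __init__(self, n_points=256, latent=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, latent))
        self.dec = nn.Sequential(nn.Linear(latent, 256), nn.ReLU(),
                                 nn.Linear(256, n_points * 3))
        self.n_points = n_points

    def forward(self, pts):
        z = self.enc(pts).max(dim=0).values            # permutation-invariant latent
        recon = self.dec(z).view(self.n_points, 3)
        return z, recon

model = PointAutoencoder()
trajectory = [torch.rand(256, 3) for _ in range(4)]    # toy point-cloud rollout
# Multi-step reconstruction loss averaged over the trajectory; DiffSRL additionally
# projects decoded states to feasible configurations and rolls them through the
# differentiable simulator before measuring the mismatch.
loss = sum(chamfer(model(p)[1], p) for p in trajectory) / len(trajectory)
loss.backward()
```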
Integration with the differentiable simulator ensures the learned latent encodings capture both instantaneous object shape and future soft-body dynamics critical for downstream control. DiffSRL achieves superior trajectory and reward-prediction accuracy, and accelerates RL policy learning for deformable object tasks when compared with autoencoder or contrastive learning baselines (Chen et al., 2021).
6. Implementation, Extensibility, and Usage
PlasticineLab is openly available and provides a consistent API for defining new environments and integrating custom controllers (Huang et al., 2021). Key implementation features include:
- Software stack: Python 3.7+, Taichi for simulation and autodiff (CUDA backend), PyTorch, numpy, gym.
- Hardware requirements: NVIDIA GPUs with compute capability ≥6.0, 16 GB RAM recommended.
- Experiment reproducibility: Pre-built scripts for RL and gradient-based methods; YAML-based scene and task definitions for extensibility.
- Simulator internals: Source code details include elastic/plasticity kernels, contact routines, and autodiff pipelines. Gradient computation is enabled through the entire simulation stack.
- Benchmarks: Complete with hyperparameter tables, detailed metrics, and visualizations.
This infrastructure supports rapid prototyping and benchmarking of new algorithms, and exposes a testbed for sim-to-real transfer and hybrid learning approaches.
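As a usage illustration, the sketch below constructs a task environment through the repository's gym-style factory and rolls out a zero action sequence; the module path and task id follow the public PlasticineLab repository but may differ across versions, so treat them as assumptions.

```python
import numpy as np
from plb.envs import make   # PlasticineLab's environment factory (version-dependent)

env = make("Move-v1")        # task id assumed from the public repository
obs = env.reset()
for _ in range(50):
    action = np.zeros(env.action_space.shape)   # placeholder controller
    obs, reward, done, info = env.step(action)
    if done:
        break
```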
7. Impact and Research Directions
PlasticineLab and its extensions have established the reference protocol for benchmarking soft-body robotic manipulation under differentiable physics. Core results include:
- Evidence that model-free RL (SAC, PPO, TD3) is generally ineffective for nontrivial deformable object tasks under sparse rewards and high-dimensional control (Huang et al., 2021).
- Demonstration that gradient-based trajectory optimization, while powerful, is fundamentally limited by local minima arising from suboptimal contact initialization, particularly in multi-stage tasks (Li et al., 2022).
- Introduction of the optimal transport-based CPDeform controller, which demonstrates that geometric cues from OT dual potentials enable robust and efficient multi-stage manipulation (Li et al., 2022).
- The establishment of dynamic state representation learning pipelines, such as DiffSRL, where end-to-end differentiability of simulation grounds learned latent spaces in task-relevant physical behaviors, facilitating more sample-efficient policy optimization (Chen et al., 2021).
Open research challenges include hybridizing RL and differentiable planning, robustification for sim-to-real transfer, generalized contact planning for arbitrary staged objectives, and expanding the scope of differentiable soft-matter simulation frameworks.
Key References:
- "PlasticineLab: A Soft-Body Manipulation Benchmark with Differentiable Physics" (Huang et al., 2021)
- "Contact Points Discovery for Soft-Body Manipulations with Differentiable Physics" (Li et al., 2022)
- "DiffSRL: Learning Dynamical State Representation for Deformable Object Manipulation with Differentiable Simulator" (Chen et al., 2021)