
MultiDex Grasping Dataset

Updated 28 November 2025
  • The dataset provides a large-scale, force-closure validated repository with over 436k dexterous grasps across varied robotic hand types and object configurations.
  • It employs a rigorous, physics-based, contact-centric sampling pipeline with multi-modal annotations, including SE(3) pose, joint angles, and object geometry.
  • Applications include vision-to-grasp mapping, representation learning, and sim-to-real transfer, enabling hand-agnostic evaluation and cross-hand generalization.

The MultiDex Grasping Dataset is a comprehensive, large-scale repository of robotic grasp demonstrations specifically designed to support the development, benchmarking, and generalization of dexterous grasp synthesis algorithms across varied robotic hands. MultiDex encompasses diverse object and hand kinematic configurations and is referenced in multiple foundational works in the grasping literature, notably "GenDexGrasp: Generalizable Dexterous Grasping" (Li et al., 2022) and "Towards a Multi-Embodied Grasping Agent" (Freiberg et al., 31 Oct 2025). Its purpose is to provide standardized, force-closure validated grasp data and rich multi-modal annotations to enable generalizable, hand-agnostic learning methods for robotic grasping.

1. Scope and Dataset Composition

MultiDex aggregates a wide spectrum of grasp samples, hand configurations, and object types, making it a benchmark for generalizable manipulation research.

  • Total Grasp Instances: 436,000 valid dexterous grasps in (Li et al., 2022); 20,000,000 grasps across 25,000 scenes in (Freiberg et al., 31 Oct 2025).
  • Hand and Gripper Diversity:
    • EZGripper (2-finger parallel) — 2 DoF
    • Barrett Hand (3-finger underactuated) — 8 DoF
    • Robotiq-3F, Allegro Hand (3/4-fingered dexterous) — 12/16 DoF
    • Shadow Hand (5-finger, anthropomorphic) — 20–22 DoF
    • Additional: DEX-EE, Franka Emika Panda, ViperX 300s, and others
  • Object Set:
    • 58–1,000+ daily-use household and industrial objects (YCB, ContactDB, Google Scanned Objects)
    • Meshes in .ply/.obj format, scale-normalized for consistent grasping statistics

Table: Hand Types and Dataset Size

| Hand Model | Fingers | DoF | # Grasps | Objects (min) |
|---|---|---|---|---|
| EZGripper | 2 | 2 | ~85k | 58 |
| Barrett | 3 | 8 | ~87k | 58 |
| Robotiq-3F | 3 | 6-8 | ~87k | 58 |
| Allegro | 4 | 16 | ~87k | 58 |
| ShadowHand | 5 | 20-22 | ~87k | 58 |

Objects are split into fixed train/test subsets for standardized evaluation (e.g., 48 train, 10 test in (Li et al., 2022)). The more recent (Freiberg et al., 31 Oct 2025) version notably features 25,000 cluttered scenes with up to 7 objects each and up to five grippers per scene.

2. Data Annotation: Modalities and Representation

MultiDex provides multi-modal, structured storage for each grasp instance to facilitate diverse modes of learning and evaluation.

  • Hand Pose: Parameterized by SE(3) end-effector pose and a full configuration vector of joint angles $\mathbf{q}_{\text{joint}} \in \mathbb{R}^N$, with $N$ determined by the hand model.
  • Object Geometry: Triangular mesh per instance; sampled vertex sets and surface normals; contact regions specified.
  • Contact Map $\Omega$: For each grasp, an object-centric, continuous-valued, per-vertex contact map is computed:

$$C(v_o, H) = 1 - 2\left(\sigma(D(v_o, H)) - 0.5\right),\quad D(v_o, H) = \min_{v_h \in H}\; e^{\gamma\left(1 - \left\langle \frac{v_o - v_h}{\|v_o - v_h\|},\, n_o \right\rangle\right)}\; \|v_o - v_h\|_2,\quad \gamma = 1$$

This provides a hand-agnostic, surface-attentive signal for learning transferable grasp descriptors.

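  • Grasp Quality Metrics: Differentiable force-closure (dfc) scores, calculated as the magnitude of the net wrench produced by the contact normals:

$$\mathrm{dfc}(X) = \lVert G c \rVert_2,$$

with $G = \begin{bmatrix} I_3 & \cdots & I_3 \\ [x_1]_\times & \cdots & [x_n]_\times \end{bmatrix}$ acting on the stacked contact normals $c$ at the contact sites $X = \{x_1, \ldots, x_n\}$, where $[x_i]_\times$ is the skew-symmetric cross-product matrix of $x_i$.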

  • Stability Flags: Binary indicator derived from physics testing under 6-axis disturbance in simulation (Isaac Gym, 0.5 m/s² applied for 1 s; success if displacement < 2 cm).
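
To make the contact-map annotation concrete, the following is a minimal NumPy sketch of the per-vertex formula given above. It assumes that $\sigma$ is the logistic sigmoid and that the hand $H$ is represented by a set of sampled surface points; the function name `contact_map` and the `eps` guard against division by zero are illustrative choices, not part of the dataset tooling.

```python
import numpy as np

def contact_map(obj_verts, obj_normals, hand_verts, gamma=1.0, eps=1e-12):
    """Per-vertex contact value C in (0, 1] for object vertices given hand points.

    obj_verts:   (n_obj, 3) object vertices v_o
    obj_normals: (n_obj, 3) unit normals n_o at those vertices
    hand_verts:  (n_hand, 3) sampled hand surface points v_h (assumed representation)
    """
    # Pairwise offsets v_o - v_h, shape (n_obj, n_hand, 3)
    diff = obj_verts[:, None, :] - hand_verts[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    unit = diff / (dist[..., None] + eps)  # eps avoids 0/0 for coincident points
    # Alignment between the offset direction and the object normal
    align = np.einsum("ohk,ok->oh", unit, obj_normals)
    # Aligned distance D: misalignment inflates the distance; minimum over hand points
    D = np.min(np.exp(gamma * (1.0 - align)) * dist, axis=1)
    sigmoid = 1.0 / (1.0 + np.exp(-D))  # assumption: sigma is the logistic sigmoid
    return 1.0 - 2.0 * (sigmoid - 0.5)

# Toy example: one object vertex, one nearby and one distant hand point
v_o = np.array([[0.0, 0.0, 0.0]])
n_o = np.array([[0.0, 0.0, 1.0]])
print(contact_map(v_o, n_o, np.array([[0.0, 0.0, -0.01], [0.0, 0.0, -5.0]])))
```

Because $D \ge 0$, the sigmoid lies in $[0.5, 1)$, so the map takes values in $(0, 1]$ and approaches 1 where the hand touches the surface.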

3. Sampling and Quality Control Methodologies

Grasp synthesis in MultiDex is defined by a rigorous, physics-based, and contact-centric sampling pipeline.

  • Force-closure optimization: MALA (Metropolis-adjusted Langevin Algorithm) sampling minimizes a composite energy function integrating force closure, joint limit, and penetration penalties:

$$E(H, X; O) = \mathrm{dfc}(X) + E_n(H) + E_p(H, O)$$

with $E_n$ enforcing joint bounds and $E_p$ penalizing mesh penetration.

  • Physical Validation: Grasps are only recorded if they pass no-collision placement, force-closure, and post-lifting disturbance. No failure samples are retained in the public release.
  • Hand-agnostic sampling: By leveraging contact maps and surface-aligned metrics, learned grasp representations are decoupled from any specific hand’s kinematics, allowing cross-hand and cross-object generalization (Li et al., 2022).
  • Dataset-level Filtering: Scenes/objects with insufficient grasp yield are pruned to maintain uniform data density for learning (Freiberg et al., 31 Oct 2025).
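
The sampling idea can be illustrated with a toy sketch. The code below is not the dataset's pipeline: it omits the hand $H$ entirely and optimizes contact points directly on a unit sphere, takes the Euclidean norm of $Gc$ so that the force-closure term is a scalar, replaces $E_n$ and $E_p$ with a single hypothetical surface-attachment penalty, and uses finite-difference gradients. All function names (`dfc`, `energy`, `mala`) and parameter values are assumptions made for illustration.

```python
import numpy as np

def skew(x):
    """Cross-product matrix [x]_x, so that skew(x) @ v == np.cross(x, v)."""
    return np.array([[0.0, -x[2], x[1]], [x[2], 0.0, -x[0]], [-x[1], x[0], 0.0]])

def dfc(points, normals):
    """Norm of the net wrench G c for contact points and their normals."""
    G = np.vstack([np.hstack([np.eye(3)] * len(points)),
                   np.hstack([skew(x) for x in points])])
    return np.linalg.norm(G @ normals.reshape(-1))

def energy(flat, weight=10.0):
    """Toy energy: dfc plus a stand-in penalty keeping contacts on a unit sphere."""
    X = flat.reshape(-1, 3)
    radius = np.linalg.norm(X, axis=1, keepdims=True)
    normals = -X / radius  # inward normals of the toy sphere object
    return dfc(X, normals) + weight * np.sum((radius - 1.0) ** 2)

def num_grad(f, x, h=1e-5):
    """Central finite-difference gradient (a differentiable framework would replace this)."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

def mala(f, x0, step=1e-2, n_steps=200, seed=0):
    """Metropolis-adjusted Langevin sampling of exp(-f); returns the best state seen."""
    rng = np.random.default_rng(seed)
    x, fx, gx = x0.copy(), f(x0), num_grad(f, x0)
    best_x, best_f, accepted = x.copy(), fx, 0
    for _ in range(n_steps):
        # Langevin proposal: gradient step plus Gaussian noise
        prop = x - step * gx + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
        fp, gp = f(prop), num_grad(f, prop)
        # Log densities of the reverse and forward Gaussian proposals
        log_q_rev = -np.sum((x - prop + step * gp) ** 2) / (4.0 * step)
        log_q_fwd = -np.sum((prop - x + step * gx) ** 2) / (4.0 * step)
        # Metropolis-Hastings correction
        if np.log(rng.uniform()) < (fx - fp) + log_q_rev - log_q_fwd:
            x, fx, gx = prop, fp, gp
            accepted += 1
            if fx < best_f:
                best_x, best_f = x.copy(), fx
    return best_x, best_f, accepted / n_steps

# Three contacts that start clustered on one side of the sphere
x0 = np.array([1.0, 0.0, 0.0, 0.9, 0.1, 0.0, 0.9, 0.0, 0.1])
best_x, best_f, rate = mala(energy, x0)
print("initial energy:", energy(x0), "best energy:", best_f, "acceptance:", rate)
```

In this toy setting the force-closure term vanishes when the inward normals sum to zero, for example for two antipodal contacts; a real pipeline would additionally propagate gradients through the hand kinematics and the penetration term.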

4. Data Organization, Splits, and Access

MultiDex data are organized for high-throughput loading and flexible batching.

  • Directory Structure: Data are split by hand and object: MultiDex/hands/HandName/Object/samples/. Each grasp is a NumPy .npz file containing
    • "q_global": 6D pose,
    • "q_joint": joint angles,
    • "verts", "normals": mesh arrays,
    • "contact_map": per-vertex float,
    • "dfc": scalar score.
  • Unified Scene Format (Freiberg et al., 31 Oct 2025): Each .npz scene contains
    • point_cloud: float32[15,000x3]
    • object_ids: object indices,
    • grasps: float32[#grasps x (7 + D_g)],
    • gripper-specific kinematic YAML.
  • Annotation Example (Python):

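import numpy as np

# Load one ShadowHand grasp sample for the "apple" object
data = np.load("MultiDex/hands/ShadowHand/apple/samples/00001.npz")
# Global hand pose and joint-angle vector
qg, qj = data["q_global"], data["q_joint"]
# Object mesh vertices and per-vertex contact map
verts, cmap = data["verts"], data["contact_map"]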

  • Split Files: Standard train/test object lists provided (e.g. train_objects.json, test_objects.json) for statistical benchmarking.
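
As a hedged illustration of the unified scene format, the sketch below builds a small synthetic scene in memory and splits the `grasps` array into a pose part and a joint part. The key names follow the list above; the interpretation of the first 7 columns as position plus unit quaternion, the contents of `object_ids`, and the helper name `split_grasps` are assumptions, since the layout is not spelled out here.

```python
import io
import numpy as np

def split_grasps(grasps, pose_dim=7):
    """Split a (#grasps, 7 + D_g) array into pose columns and joint columns."""
    grasps = np.asarray(grasps)
    if grasps.ndim != 2 or grasps.shape[1] < pose_dim:
        raise ValueError("expected a 2-D array with at least pose_dim columns")
    return grasps[:, :pose_dim], grasps[:, pose_dim:]

# Synthetic stand-in for one scene file (random values, 16-DoF gripper assumed)
rng = np.random.default_rng(0)
buffer = io.BytesIO()
np.savez(
    buffer,
    point_cloud=rng.random((15000, 3), dtype=np.float32),
    object_ids=np.array([3, 17], dtype=np.int64),  # hypothetical object indices
    grasps=rng.random((5, 7 + 16), dtype=np.float32),
)
buffer.seek(0)

scene = np.load(buffer)
poses, joints = split_grasps(scene["grasps"])
print(scene["point_cloud"].shape, poses.shape, joints.shape)
```

With a real scene file, the in-memory buffer would be replaced by the file path, and `D_g` would follow from the gripper-specific kinematic YAML.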

5. Gripper Kinematics and Cross-Hand Generalization

Each robotic hand in MultiDex is fully specified by a standard Denavit–Hartenberg or product-of-exponentials (PoE) kinematic model, supporting precise forward kinematics for arbitrary joint configurations and SE(3) base poses. The fundamental pose calculation uses the product-of-exponentials formulation:

$$T_{\mathrm{EE}}(\mathbf{q}_g) = \prod_{i=1}^{D_g} \exp(\hat{\xi}_i \theta_i)\, T_{0\mathrm{EE}},$$

where $\theta_i$ are the joint values and $\hat{\xi}_i$ are the joint twists for link $i$ (Freiberg et al., 31 Oct 2025).

This explicit, modular model of hand kinematics underpins the generalizability of learning algorithms, enabling flow-based and diffusion-based synthesis on the product manifold $\mathcal{M} = \mathrm{SE}(3) \times \mathbb{R}^{D_g}$ for handling arbitrary gripper types and degrees of freedom (Freiberg et al., 31 Oct 2025).
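
The product-of-exponentials formula can be sketched in a few lines of NumPy. This is an illustrative implementation under stated assumptions: each twist is stored as `(v, w)` with a unit rotation axis `w` for revolute joints (or `w = 0` for prismatic joints), and the function names `exp_twist` and `forward_kinematics` are hypothetical rather than taken from the released code.

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix of a 3-vector."""
    return np.array([[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]])

def exp_twist(xi, theta):
    """Homogeneous transform exp(xi_hat * theta) for a twist xi = (v, w)."""
    v, w = np.asarray(xi[:3], dtype=float), np.asarray(xi[3:], dtype=float)
    T = np.eye(4)
    if np.allclose(w, 0.0):  # prismatic joint: pure translation along v
        T[:3, 3] = v * theta
        return T
    W = hat(w)  # assumes ||w|| == 1
    # Rodrigues' formula for the rotation part
    T[:3, :3] = np.eye(3) + np.sin(theta) * W + (1.0 - np.cos(theta)) * (W @ W)
    # Translation part of the screw motion
    T[:3, 3] = (np.eye(3) * theta + (1.0 - np.cos(theta)) * W
                + (theta - np.sin(theta)) * (W @ W)) @ v
    return T

def forward_kinematics(twists, thetas, T0):
    """T_EE = prod_i exp(xi_hat_i * theta_i) @ T0 (twists in the base frame)."""
    T = np.eye(4)
    for xi, theta in zip(twists, thetas):
        T = T @ exp_twist(xi, theta)
    return T @ T0

# Planar two-link example: both joints rotate about z, link lengths of 1
w = np.array([0.0, 0.0, 1.0])
q2 = np.array([1.0, 0.0, 0.0])                 # a point on the second joint axis
twists = [np.concatenate([np.zeros(3), w]),
          np.concatenate([-np.cross(w, q2), w])]  # v = -w x q for a revolute joint
T0 = np.eye(4)
T0[:3, 3] = [2.0, 0.0, 0.0]                    # end-effector pose at zero angles
print(forward_kinematics(twists, [0.0, np.pi / 2], T0)[:3, 3])
```

A full grasp on $\mathcal{M}$ would compose an additional SE(3) base transform with the joint-dependent part shown here.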

6. Benchmarking, Metrics, and Reference Results

MultiDex benchmarking protocols employ standardized experimental setups.

  • Success Rate: Fraction of physically validated grasps meeting displacement and lift criteria, reported per hand–object–split configuration.
  • Diversity: Joint-angle standard deviation on a test set (e.g., $\sigma \approx 0.207$ rad for ShadowHand (Li et al., 2022)).
  • Comparison Table (ShadowHand, test-set, (Li et al., 2022)):

| Method | Generalizable | Success (%) | Diversity (rad) | Time (s) |
|---|---|---|---|---|
| dfc [Liu et al.] | | 79.53 | 0.344 | >1800 |
| GraspCVAE | | 19.38/22.03* | 0.340/0.355 | 0.012/43.2 |
| UniGrasp | | 80.0† | 0.000 | 9.33 |
| GenDexGrasp | | 77.19 | 0.207 | 16.42 |
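
The two headline metrics can be computed along the following lines. This is a sketch under assumptions: success is decided from the post-disturbance displacement with the 2 cm threshold mentioned earlier, and diversity is taken as the per-joint standard deviation across grasps averaged over joints, which is one plausible reading of "joint-angle standard deviation". The function names are illustrative.

```python
import numpy as np

def success_rate(displacements_m, threshold_m=0.02):
    """Fraction of grasps whose object displacement stays below the threshold."""
    d = np.asarray(displacements_m, dtype=float)
    if d.size == 0:
        raise ValueError("no grasps to evaluate")
    return float(np.mean(d < threshold_m))

def joint_diversity(q_joint):
    """Mean over joints of the standard deviation across grasps (radians)."""
    q = np.asarray(q_joint, dtype=float)
    if q.ndim != 2 or q.shape[0] == 0:
        raise ValueError("expected a non-empty (n_grasps, n_joints) array")
    return float(np.mean(np.std(q, axis=0)))

# Toy example: four simulated displacements (meters) and two 2-joint grasps
print(success_rate([0.001, 0.05, 0.019, 0.0]))
print(joint_diversity([[0.0, 0.0], [2.0, 0.0]]))
```

`success_rate` returns a fraction, so it would be multiplied by 100 to compare against percentages such as those in the table.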

7. Usage and Applications

MultiDex is designed for reproducible research and benchmarking in dexterous, generalizable grasp synthesis:

  • Vision–to–grasp learning: Train CNNs or point-cloud networks for image-to-grasp mapping using multi-modal annotations.
  • Representation learning: Contact-map and pose embeddings for hand-agnostic grasp reasoning.
  • Algorithm evaluation: Standardized train/test splits enable objective comparison of generalization, diversity, and efficiency.
  • Cross-hand transfer: Contact map intermediates enable hand-to-hand generalization without retraining.
  • Sim-to-real transfer: Physically validated grasps facilitate adaptation to hardware via fine-tuning.

MultiDex and associated codebases for GenDexGrasp and multi-embodiment diffusion/flow-based models are released under open licenses, supporting direct integration into research pipelines (Li et al., 2022, Freiberg et al., 31 Oct 2025).


MultiDex Grasping Dataset thus provides the robotic manipulation research community with a rigorously annotated, physically validated, and hand-diverse resource for benchmarking and advancing generalizable dexterous grasping algorithms. Its force-closure-centric design, object/hand diversity, and hand-agnostic contact representations underpin its continued influence in embodied intelligence research.
