Rig-XL is a large-scale dataset of rigged 3D models that provides standardized annotations, extensive category coverage, and benchmarks for automated skeletal rigging.
The dataset features normalized meshes, detailed skeleton trees with per-vertex skinning weights, and efficient tokenization schemes to streamline model processing.
Rig-XL employs rigorous filtering, augmentation, and evaluation protocols, enabling robust training and benchmarking for both academic research and industrial applications.
Rig-XL is a large-scale dataset of rigged 3D models developed to enable and benchmark automated skeletal rigging, specifically supporting the UniRig framework. Comprising 14,611 assets with annotated skeletons and per-vertex skinning weights, Rig-XL offers extensive category coverage for diverse object types, including humanoids, animals, inorganics, and non-standard topologies. Rig-XL addresses challenges encountered in previous datasets by providing standardized normalization, rigorous filtering, semantically rich categorization, and comprehensive annotation protocols, facilitating robust model training and evaluation for both academic and industrial rigging tools (Zhang et al., 16 Apr 2025).
1. Dataset Composition and Scope
Rig-XL consists of 14,611 rigged 3D models, each accompanied by:
Mesh geometry available in .obj, .fbx, and .glTF formats.
A single, connected skeleton tree (joint positions and parent indices).
Per-vertex skinning weights.
Optional bone attributes for physics-based “spring bones” where present.
The dataset was sourced primarily from Objaverse-XL, subject to extensive filtering, augmentation, and manual verification. All models are normalized to fit within a [−1,1]³ unit cube in both geometry and skeleton coordinate space. Skeletons are structurally restricted to trees with 10 to 256 bones and a single connected component to ensure topological suitability for rigging systems.
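The normalization and structural constraints described above can be sketched in a few lines. This is a minimal illustration, not the dataset's actual tooling: the function names (`fit_cube`, `apply_cube`, `is_valid_skeleton`), the use of -1 as the root's parent marker, and the choice to count one bone per joint are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def fit_cube(points: Sequence[Vec3]) -> Tuple[Vec3, float]:
    """Return the bounding-box center and half of the longest box side."""
    mins = [min(p[i] for p in points) for i in range(3)]
    maxs = [max(p[i] for p in points) for i in range(3)]
    center = tuple((lo + hi) / 2.0 for lo, hi in zip(mins, maxs))
    half = max(hi - lo for lo, hi in zip(mins, maxs)) / 2.0
    return center, (half or 1.0)  # fall back to 1.0 for a degenerate (single-point) input


def apply_cube(points: Sequence[Vec3], center: Vec3, half: float) -> List[Vec3]:
    """Shift and uniformly scale points so the fitted box lies inside [-1, 1]^3."""
    return [tuple((p[i] - center[i]) / half for i in range(3)) for p in points]


def is_valid_skeleton(parents: Sequence[int], min_bones: int = 10, max_bones: int = 256) -> bool:
    """Check that `parents` describes one tree rooted at joint 0 within the size limits.

    parents[j] is the 0-based parent index of joint j; the root uses -1 (assumed convention).
    """
    n = len(parents)
    if not (min_bones <= n <= max_bones) or parents[0] != -1:
        return False
    for j in range(1, n):
        seen = set()
        cur = j
        # Walk up to the root; fail on a cycle, a second root, or an out-of-range index.
        while cur != 0:
            if cur in seen or not (0 <= parents[cur] < n):
                return False
            seen.add(cur)
            cur = parents[cur]
    return True
```

In practice the transform would be fitted once on the mesh vertices and then applied to both the vertices and the joint positions, so that geometry and skeleton stay aligned.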
2. Category Distribution and Topological Diversity
Rig-XL provides explicit coverage across eight semantically defined object categories, enabling evaluation and training on both standard and challenging topologies. Each model is assigned to a single category through automated captioning and classification:
| Category | Proportion (%) | Description |
|---|---|---|
| Mixamo | ≈25 | Standard humanoid templates |
| Biped | ≈20 | Non-Mixamo two-legged characters |
| Quadruped | ≈15 | Four-legged animals |
| Bird/Flyer | ≈10 | Avian and flying forms |
| Insect/Arachnid | ≈8 | Multi-legged arthropods |
| Water Creature | ≈7 | Aquatic organisms |
| Static Objects | ≈5 | Inorganic/static (furniture, pillows, etc.) |
| Other | ≈10 | Unclassified or miscellaneous |
Topological diversity is quantified by bone counts, with a primary mode at 52 bones (reflecting Mixamo full-body rigs) and a secondary mode at 28 bones (mainly Mixamo models lacking hand structures). The minimum and maximum number of bones per asset are 10 and 256, respectively (excluding outliers).
3. Annotation, Tokenization, and Data Representation
Rig-XL advances the annotation and representation of rigged models through several strategies:
Skeleton Tree Tokenization:
Skeletons are encoded into one-dimensional token sequences for autoregressive model training. Each joint coordinate in [−1,1] is discretized into one of D=256 bins via
M(x) = ⌊(x+1)/2 · D⌋
with the inverse mapping
M⁻¹(d) = 2d/D − 1.
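The two mappings above translate directly into code. The sketch below is illustrative only; the function names are hypothetical, and the clamp to D−1 is an added assumption, since the formula as written maps x = 1 to the out-of-range value D.

```python
import math

D = 256  # number of discretization bins, as stated in the text


def quantize(x: float, d: int = D) -> int:
    """M(x) = floor((x + 1) / 2 * D), clamped to [0, D - 1] (the clamp is an assumption)."""
    token = math.floor((x + 1.0) / 2.0 * d)
    return min(max(token, 0), d - 1)


def dequantize(token: int, d: int = D) -> float:
    """M^{-1}(d) = 2d / D - 1, which returns the lower edge of the bin."""
    return 2.0 * token / d - 1.0
```

Because the inverse returns the lower edge of each bin, a round trip through `quantize` and `dequantize` can shift a coordinate by up to one bin width, 2/D ≈ 0.0078.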
Two tokenization schemes are published:
Naïve Sequence: Standard depth-first serialization (average length ≈266.28 tokens/model).
Optimized Tokenization: Incorporates class tokens (e.g., <mixamo>), template chain recognition (Mixamo body and hand structures), spring-bone chain grouping (depth-first search), and branch sorting (descending tail coordinate order). This reduces the average sequence length to 187.15 tokens/model, a 29.7% reduction.
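To make the idea of depth-first serialization concrete, here is a small sketch. It is not the published token format: the layout (one class token followed by three coordinate tokens per joint) is hypothetical and does not encode parent links, the `<cls>` token is a placeholder, and sorting siblings by their (x, y, z) tuple in descending order is only one assumed reading of "descending tail coordinate order".

```python
import math
from typing import Dict, List, Sequence, Tuple, Union

Vec3 = Tuple[float, float, float]
Token = Union[int, str]


def quantize(x: float, d: int = 256) -> int:
    """Bin a coordinate in [-1, 1] into d integer tokens (clamped at d - 1)."""
    return min(max(math.floor((x + 1.0) / 2.0 * d), 0), d - 1)


def dfs_order(parents: Sequence[int], joints: Sequence[Vec3]) -> List[int]:
    """Return joint indices in depth-first order starting from root joint 0."""
    children: Dict[int, List[int]] = {j: [] for j in range(len(parents))}
    for j, p in enumerate(parents):
        if p >= 0:  # the root is marked with -1 (assumed convention)
            children[p].append(j)
    order: List[int] = []
    stack = [0]
    while stack:
        j = stack.pop()
        order.append(j)
        # Ascending sort plus a LIFO stack means the sibling with the
        # largest coordinate tuple is visited first.
        stack.extend(sorted(children[j], key=lambda c: joints[c]))
    return order


def serialize(parents: Sequence[int], joints: Sequence[Vec3],
              class_token: str = "<cls>") -> List[Token]:
    """Emit a class token, then quantized x, y, z for each joint in DFS order."""
    tokens: List[Token] = [class_token]
    for j in dfs_order(parents, joints):
        tokens.extend(quantize(c) for c in joints[j])
    return tokens
```

As a check on the figures quoted above, the reported averages give 1 − 187.15/266.28 ≈ 0.297, which matches the stated 29.7% reduction.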
File and Metadata Conventions:
Meshes are internally converted to uniform-density point clouds for processing (N=65,536 points for skeleton prediction, N=16,384 for skinning).
Parent indices are stored 0-based. Root joint index is always 0.
Joint names retain author conventions for maximal compatibility (e.g., “mixamorig:Head”).
Bone connectivity and plausibility are enforced structurally and through filtering of exotic or malformed topologies.
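The text does not say how the uniform-density point clouds are produced. One common way to get uniform surface density is area-weighted triangle sampling, sketched below as an assumption rather than as the dataset's actual procedure; the function names and the fixed seed are illustrative.

```python
import math
import random
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def triangle_area(a: Vec3, b: Vec3, c: Vec3) -> float:
    """Half the norm of the cross product of two edge vectors."""
    ux, uy, uz = b[0] - a[0], b[1] - a[1], b[2] - a[2]
    vx, vy, vz = c[0] - a[0], c[1] - a[1], c[2] - a[2]
    cx, cy, cz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)


def sample_surface(vertices: Sequence[Vec3], faces: Sequence[Tuple[int, int, int]],
                   n: int, seed: int = 0) -> List[Vec3]:
    """Draw n points uniformly over the surface of a triangle mesh."""
    rng = random.Random(seed)
    areas = [triangle_area(*(vertices[i] for i in f)) for f in faces]
    # Pick triangles with probability proportional to area.
    chosen = rng.choices(range(len(faces)), weights=areas, k=n)
    points: List[Vec3] = []
    for fi in chosen:
        a, b, c = (vertices[i] for i in faces[fi])
        r1, r2 = rng.random(), rng.random()
        s = math.sqrt(r1)
        # Barycentric weights that are uniform over the triangle.
        wa, wb, wc = 1.0 - s, s * (1.0 - r2), s * r2
        points.append(tuple(wa * a[k] + wb * b[k] + wc * c[k] for k in range(3)))
    return points
```

With the sizes quoted above, a call would use `n=65536` for the skeleton input and `n=16384` for the skinning input.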
4. Evaluation Metrics and Protocols
Rig-XL supports comprehensive quantitative benchmarking of rigging algorithms across multiple metrics:
Evaluations are conducted under geometric augmentations (random rotations of ±30°, scaling in [0.8, 1.0], and motion perturbations).
Comparative performance baselines include RigNet, NBS (Neural Blend Shapes), TA-Rig, and commercial software (Meshy, Anything World, AccuRIG, Tripo).
5. Preprocessing, Filtering, and Quality Control
Rig-XL employs a five-stage curation pipeline:
(1) Skeleton-Based Filtering: Retain assets with a single connected skeleton tree and 10–256 bones.
(2) Automated De-duplication and Categorization: Use perceptual hashing for duplicate removal; category assignment using a vision-LLM (ChatGPT-4o).
(3) Manual Verification: Visual inspection with overlaid skeletons; root topology issues repaired by minimum spanning tree reconnections.
(4) Training-Time Outlier Removal: During model training, models with reconstruction loss greater than 10× the average are dynamically excluded.
(5) Normalization & Augmentation: Point cloud normalization to the unit cube, random rotations, scaling, and motion perturbations applied to expand training data diversity.
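The training-time outlier rule (loss greater than 10× the average) can be illustrated with a short sketch. The function name and the dictionary of per-asset losses are hypothetical, and computing the average over all assets, including the outliers themselves, is an assumption the text does not confirm.

```python
from typing import Dict, List


def keep_inliers(losses: Dict[str, float], factor: float = 10.0) -> List[str]:
    """Return the asset ids whose loss is at most `factor` times the mean loss."""
    mean_loss = sum(losses.values()) / len(losses)
    return [asset for asset, loss in losses.items() if loss <= factor * mean_loss]
```

Because the mean includes the outliers in this version, a single extreme loss raises the threshold. With only a handful of assets the rule may exclude nothing, so it is most meaningful over a large batch or a running average.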
6. Design Challenges and Limitations
Several data quality challenges are addressed:
Data sourcing: Objaverse-XL contains predominantly static, unrigged models; thus, Rig-XL is constrained to instances where both skeleton and skinning information are present.
Structural anomalies: Filtering targets disconnected components, missing skinning weights, unconnected bones, and implausible skeletal structures (e.g., root out-degree > 4).
Annotation precision: Manual and algorithmic intervention reduces erroneous topology and improves category coherence.
A plausible implication is that, due to reliance on upstream data quality and rigid filtering, some potentially valid exotic rigs may be excluded from Rig-XL.
7. Methodological Integration and Use Cases
Rig-XL is the foundational dataset for UniRig, supporting autoregressive skeleton prediction and bone-point cross-attention skinning. Key equations used during training include:
Next-Token Prediction Loss:
$\mathcal{L}_{NTP} = -\sum_{t=1}^T \log P(s_t \mid s_{<t})$
Bone–Point Cross-Attention for Skinning:
$\mathcal{F}_W = \mathrm{softmax}\!\left(\frac{Q_W K_W^T}{\sqrt{F}}\right), \quad \mathcal{W} = \mathrm{softmax}\bigl(E_W\left([\mathcal{F}_W, D]\right)\bigr)$
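Both training equations reduce to a few lines of arithmetic. The sketch below is a plain-Python illustration under stated assumptions: `probs[t]` stands for the model's predicted distribution at step t (already conditioned on earlier tokens), `Q` and `K` are small query and key matrices of feature width F, and the second stage involving E_W and D is omitted because the text does not define it. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def ntp_loss(probs: Sequence[Sequence[float]], targets: Sequence[int]) -> float:
    """Negative log-likelihood: -sum_t log P(s_t | s_<t)."""
    return -sum(math.log(probs[t][s]) for t, s in enumerate(targets))


def softmax(xs: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one row."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_scores(Q: Sequence[Sequence[float]], K: Sequence[Sequence[float]]) -> List[List[float]]:
    """Row-wise softmax(Q K^T / sqrt(F)), where F is the feature width."""
    f = len(Q[0])
    return [
        softmax([sum(q * k for q, k in zip(q_row, k_row)) / math.sqrt(f) for k_row in K])
        for q_row in Q
    ]
```

Each row of the returned matrix sums to 1, so it can be read as a distribution over the keys for one query.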
Rig-XL enables systematic ablation and cross-category generalization studies by providing a unified, large-scale, and consistently annotated dataset. This extends its utility beyond UniRig, facilitating rigorous benchmarking of both academic and commercial rigging systems (Zhang et al., 16 Apr 2025).