Tex-Parts: Unified 3D Part Annotations

Updated 23 December 2025

Tex-Parts is a large-scale dataset of densely labeled, semantically consistent 3D part annotations, labeled under a part ontology unified from PartNet, 3DCoMPaT++, and Find3D.
It is built with an efficient human-in-the-loop annotation pipeline around the ALIGN-Parts engine, which achieves a 5–8× speedup in segmentation and naming over manual annotation.
The unified canonical ontology contains 1,794 unique part names, enabling open-vocabulary, object-conditional segmentation and robust semantic evaluation.

Tex-Parts is a large-scale dataset of densely labeled, semantically consistent 3D part annotations, derived from the unannotated TexVerse corpus using the ALIGN-Parts engine. It is designed for open-vocabulary, object-conditional 3D part segmentation and naming, and provides per-point part masks and affordance-aware part names for thousands of 3D models. To address inconsistent part definitions across existing datasets, Tex-Parts merges the part vocabularies of PartNet, 3DCoMPaT++, and Find3D into a single ontology of 1,794 canonical names. Annotation combines automated proposals with human-in-the-loop verification, delivering a 5–8× speedup over manual methods while maintaining semantic precision (Paul et al., 19 Dec 2025).

1. Motivation and Objectives

Tex-Parts was conceived in response to persistent limitations in 3D part segmentation resources. Available benchmarks such as PartNet, 3DCoMPaT++, and Find3D each employ distinct part taxonomies and granularities, impeding cross-dataset training and transfer. Tex-Parts aims to fill this gap by offering:

A large-scale, open-vocabulary 3D part dataset with labels consistent across object categories and grounded in human-centric affordance descriptions.
Seamless integration of the ALIGN-Parts model to propose both geometric part segments ("partlets") and natural-language part names, supporting open-vocabulary and object-conditional annotation.
Efficient, verifiable annotation: a human annotator validates or corrects a complete object's segmentation and naming in 3–5 minutes versus 15–25 minutes for traditional de novo annotation, yielding a speedup factor of 5–8×.
A publicly released resource with per-point part masks and affordance-aware names, supporting scalable development and evaluation for semantic 3D part segmentation.

2. Unified Canonical Part Taxonomy

A central feature of Tex-Parts is its canonical part ontology, constructed by unifying the part vocabularies of PartNet, 3DCoMPaT++, and Find3D:

Each original part name from these datasets is embedded using an MPNet sentence encoder.
Agglomerative clustering in the embedding space identifies candidate merges (e.g., “microwave” vs. “microwave_oven”).
High-similarity pairs are presented to a Gemini LLM prompt, which discerns true functional equivalence or distinction; equivalent names are collapsed, while distinct names remain separate (e.g., “front_bumper” vs. “rear_bumper”).
An alias table records mappings from each original label (e.g., PartNet’s “bed_footboard”) to the canonical label (“footboard”).

This process results in a unified, flat vocabulary comprising 1,794 unique part names. Each object category draws from its relevant subset (such as “engine” and “wing” for airplanes), ensuring that only semantically appropriate part names are considered during annotation and inference. This global vocabulary enhances label consistency across categories and datasets (Paul et al., 19 Dec 2025).
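The final collapsing step can be illustrated with a minimal sketch. It assumes the embedding, clustering, and LLM stages have already produced a list of name pairs judged equivalent; the pair list below is made up for illustration, and the rule for picking the canonical name (shortest, then alphabetical) is an assumption rather than the authors' documented choice. The function name `build_alias_table` is hypothetical.

```python
from collections import defaultdict


def build_alias_table(names, equivalent_pairs):
    """Map every original name to one canonical name per equivalence group."""
    parent = {n: n for n in names}

    def find(n):
        # Walk up to the group representative, shortening the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    # Merge the groups of every pair judged functionally equivalent.
    for a, b in equivalent_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    groups = defaultdict(list)
    for n in names:
        groups[find(n)].append(n)

    alias = {}
    for members in groups.values():
        # Assumption: shortest name wins, ties broken alphabetically.
        canonical = min(members, key=lambda m: (len(m), m))
        for m in members:
            alias[m] = canonical
    return alias


names = ["microwave", "microwave_oven", "front_bumper", "rear_bumper",
         "bed_footboard", "footboard"]
# Stand-in for the LLM verdicts: only these pairs were judged equivalent.
pairs = [("microwave", "microwave_oven"), ("bed_footboard", "footboard")]

alias_table = build_alias_table(names, pairs)
canonical_vocab = sorted(set(alias_table.values()))
```

Names that never appear in an equivalent pair, such as "front_bumper" and "rear_bumper" here, map to themselves and stay distinct in the resulting vocabulary.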

3. Data Collection and Annotation Workflow

Tex-Parts is sourced from TexVerse, a public corpus of over 850,000 high-quality textured meshes. The annotation pipeline comprises three main stages:

Pre-filtering: Each mesh undergoes automated assessment using shape thumbnails, metadata, and polycount; a Gemini model filters out low-quality or broken assets.
Automated Proposal: Remaining meshes are sampled to 10,000 surface points and processed by ALIGN-Parts, which predicts up to K=32 partlets. For each partlet, the model outputs a soft point mask, a text-embedding prototype aligned to an affordance description, and a dual-source confidence score (combining softmax alignment and Mahalanobis distance).
Human Verification: Shapes are ranked by partlet confidence scores and distributed to annotators in two phases:
- Phase I: Annotators accept or edit proposed partlet masks and names, aided by an autocomplete prompt to facilitate naming.
- Phase II: Annotators supplement any missing parts, if necessary.

By prioritizing high-confidence predictions, the pipeline optimizes annotator efficiency while maintaining segmentation and naming accuracy (Paul et al., 19 Dec 2025).
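The ranking step can be sketched as follows. The source does not state how the softmax and Mahalanobis signals are combined or aggregated per shape, so this example assumes both lie in [0, 1], multiplies them per partlet, and averages over partlets; the function names and shape identifiers are hypothetical.

```python
from statistics import mean


def partlet_confidence(conf_soft, conf_maha):
    # Assumption: both scores lie in [0, 1] and are combined by a product.
    return conf_soft * conf_maha


def rank_shapes(proposals):
    """Order shape ids from most to least confident.

    proposals maps shape_id -> list of (conf_soft, conf_maha), one per partlet.
    """
    scores = {}
    for shape_id, partlets in proposals.items():
        if partlets:
            scores[shape_id] = mean(
                [partlet_confidence(s, m) for s, m in partlets]
            )
        else:
            # A shape with no proposed partlets goes to the back of the queue.
            scores[shape_id] = 0.0
    return sorted(scores, key=scores.get, reverse=True)


proposals = {
    "chair_001": [(0.9, 0.8), (0.95, 0.9)],
    "lamp_007": [(0.6, 0.5), (0.7, 0.4)],
    "mug_042": [(0.99, 0.95)],
}
queue = rank_shapes(proposals)
```

With these toy scores the queue starts with "mug_042" and ends with "lamp_007", so annotators see the shapes most likely to need only a quick accept first.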

4. Dataset Composition and Statistics

The current release of Tex-Parts features:

Number of annotated shapes: 8,450
Canonical part vocabulary: 1,794 names
Total unique part categories instantiated: ≈14,000 (reflecting object-conditional subsets and additional Phase II insertions)
Average parts per shape: ≈12 (most in the 3–20 range; few above 28)
Planned train/val/test split: 80/10/10 by shape, with no overlap in object subcategories across folds
Annotation speedup: 5–8× relative to conventional manual mesh segmentation tools

This scale and consistency facilitate transfer learning and robust benchmarking for semantic 3D part segmentation tasks (Paul et al., 19 Dec 2025).
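One way to realize a split by shape with no subcategory shared across folds is a greedy group split, sketched below. This is an illustration under assumptions, not the released split procedure: the function name, the seed, and the greedy fill rule are all hypothetical, and the achieved proportions only approximate 80/10/10 when subcategories differ in size.

```python
import random
from collections import defaultdict


def split_by_subcategory(shape_to_subcat, fractions=(0.8, 0.1, 0.1), seed=0):
    """Assign whole subcategories to train/val/test, approximating fractions."""
    groups = defaultdict(list)
    for shape_id, subcat in shape_to_subcat.items():
        groups[subcat].append(shape_id)

    subcats = sorted(groups)
    random.Random(seed).shuffle(subcats)

    fold_names = ["train", "val", "test"]
    targets = [f * len(shape_to_subcat) for f in fractions]
    splits = {name: [] for name in fold_names}

    for subcat in subcats:
        # Place the whole subcategory in the fold furthest below its target.
        deficits = [targets[i] - len(splits[fold_names[i]]) for i in range(3)]
        fold = fold_names[deficits.index(max(deficits))]
        splits[fold].extend(groups[subcat])
    return splits


# Toy data: 100 shapes spread over 20 subcategories of 5 shapes each.
shape_to_subcat = {f"shape_{i:03d}": f"subcat_{i % 20:02d}" for i in range(100)}
splits = split_by_subcategory(shape_to_subcat)
```

Because each subcategory is assigned as a unit, no subcategory can appear in more than one fold, which is the property the planned split requires.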

5. File Formats, Metadata, and Licensing

Each Tex-Parts example is distributed with the following components:

mesh.obj, mesh.mtl: The original TexVerse triangle mesh and material definitions.
points.ply: 10,000 sampled surface points, each recording XYZ coordinates, normals, and RGB color.
annotation.json: Contains all segmentation and semantic annotation metadata, including:
- parts: Array of objects, each with:
  - part_id: Integer identifier
  - name: Canonical part label
  - affordance_description: Text description of part function
  - mask: Indices of member points
  - conf_soft: Softmax-based confidence
  - conf_maha: Mahalanobis-based confidence
- global: Object class and confidence
- aliases: Mapping from original tags or groups to canonical part IDs
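A short sketch of reading such an annotation and turning the index-based masks into one label per point is given below. The field names under parts follow the list above; the exact JSON nesting, the keys inside global ("class", "confidence"), and the sample values are assumptions made for illustration, as is the helper name `per_point_labels`.

```python
import json

NUM_POINTS = 10_000  # points.ply holds 10,000 sampled surface points

# Hypothetical annotation.json content; values are invented for illustration.
annotation = json.loads("""
{
  "parts": [
    {"part_id": 0, "name": "seat",
     "affordance_description": "flat surface that supports a sitting person",
     "mask": [0, 1, 2, 3], "conf_soft": 0.93, "conf_maha": 0.88},
    {"part_id": 1, "name": "leg",
     "affordance_description": "vertical support that holds the seat up",
     "mask": [4, 5], "conf_soft": 0.81, "conf_maha": 0.76}
  ],
  "global": {"class": "chair", "confidence": 0.97},
  "aliases": {"chair_seat": 0}
}
""")


def per_point_labels(annotation, num_points=NUM_POINTS):
    """Return a list with the part_id of every point, or -1 if unassigned."""
    labels = [-1] * num_points
    for part in annotation["parts"]:
        for idx in part["mask"]:
            labels[idx] = part["part_id"]
    return labels


labels = per_point_labels(annotation)
id_to_name = {p["part_id"]: p["name"] for p in annotation["parts"]}
```

A dense label array like this lines up index-for-index with the rows of points.ply, which is usually the most convenient form for training or for computing overlap metrics.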

All associated code is released under the Apache 2.0 license; the dataset itself adopts the TexVerse CC BY 4.0 license (Paul et al., 19 Dec 2025).

6. Evaluation Metrics

Tex-Parts introduces specialized metrics for evaluating named 3D part segmentation, capturing both geometric and semantic correctness. Let $G$ represent the set of ground-truth part masks and $S$ the set of predictions. With $IoU(g,s) = |g \cap s| / |g \cup s|$ and $s^*(g) = \arg\max_{s \in S} IoU(g, s)$, the following metrics are defined:

Class-agnostic mIoU: $mIoU = (1/|G|) \sum_{g\in G} \max_{s \in S} IoU(g,s)$ (focused solely on geometric overlap).
Label-Aware mIoU (LA-mIoU): $LA\text{-}mIoU = (1/|G|) \sum_{g\in G} \mathbb{1}[\ell(s^*(g)) = \ell(g)] \cdot IoU(g, s^*(g))$, where $\ell(\cdot)$ denotes the part label index; any label mismatch incurs zero reward.
Relaxed Label-Aware mIoU (rLA-mIoU): $rLA\text{-}mIoU = (1/|G|) \sum_{g\in G} IoU(g, s^*(g)) \cdot \cos(t_{\ell(g)}, t_{\ell(s^*(g))})$, where $t_i$ is the MPNet embedding of the part's affordance description; partial credit is awarded for semantically similar predictions.

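On a held-out test set of 206 shapes, the correlation between mIoU and LA-mIoU is Pearson's $r = 0.739$ and Spearman's $\rho = 0.730$, while the correlation between mIoU and rLA-mIoU is $r = 0.978$ and $\rho = 0.974$. By construction, $mIoU \geq rLA\text{-}mIoU$ because the cosine factor is at most 1, and $rLA\text{-}mIoU \geq LA\text{-}mIoU$ whenever the cosine similarities are non-negative, since a matching label gives a cosine of 1 and a mismatch contributes zero to LA-mIoU (Paul et al., 19 Dec 2025).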
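The three metrics can be computed for a single shape with a few lines of code. The sketch below follows the definitions in this section, representing each mask as a set of point indices; the tiny 2-D vectors stand in for MPNet embeddings of affordance descriptions and are invented for illustration, and the function names are hypothetical.

```python
import math


def iou(g, s):
    """Intersection over union of two sets of point indices."""
    union = len(g | s)
    return len(g & s) / union if union else 0.0


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def part_metrics(gt, pred, embed):
    """Return (mIoU, LA-mIoU, rLA-mIoU) for one shape.

    gt, pred: lists of (label, set_of_point_indices).
    embed: label -> embedding vector (toy stand-in for MPNet).
    """
    if not gt or not pred:
        return 0.0, 0.0, 0.0
    m = la = rla = 0.0
    for g_label, g_mask in gt:
        # s*(g): the prediction with the highest IoU against this ground truth.
        s_label, s_mask = max(pred, key=lambda p: iou(g_mask, p[1]))
        overlap = iou(g_mask, s_mask)
        m += overlap
        la += overlap if s_label == g_label else 0.0
        rla += overlap * cosine(embed[g_label], embed[s_label])
    n = len(gt)
    return m / n, la / n, rla / n


gt = [("seat", {0, 1, 2, 3}), ("leg", {4, 5, 6, 7})]
pred = [("seat", {0, 1, 2}), ("armrest", {4, 5, 6, 7, 8})]
embed = {"seat": (1.0, 0.0), "leg": (0.0, 1.0), "armrest": (0.6, 0.8)}

miou, la_miou, rla_miou = part_metrics(gt, pred, embed)
```

In this toy case the mislabeled "leg" region earns no credit under LA-mIoU but partial credit under rLA-mIoU, so the three values come out at roughly 0.775, 0.375, and 0.695 respectively.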

7. Significance and Benchmarking

Tex-Parts establishes a new standard for open-vocabulary, semantically grounded 3D part segmentation. Its unified ontology and annotation methodology support models capable of producing complete, disjoint decompositions into human-readable, affordance-aware part labels. The dataset is anticipated to facilitate community-wide advances in training and evaluating models requiring consistent, scalable, and semantically meaningful 3D segmentation (Paul et al., 19 Dec 2025).


References

Paul et al. (19 Dec 2025). Name That Part: 3D Part Segmentation and Naming.
