Thingi10K: A Dataset of 10,000 3D-Printing Models

Published 16 May 2016 in cs.GR | (1605.04797v2)

Abstract: Empirically validating new 3D-printing related algorithms and implementations requires testing data representative of inputs encountered in the wild. An ideal benchmarking dataset should not only draw from the same distribution of shapes people print in terms of class (e.g., toys, mechanisms, jewelry), representation type (e.g., triangle soup meshes) and complexity (e.g., number of facets), but should also capture problems and artifacts endemic to 3D printing models (e.g., self-intersections, non-manifoldness). We observe that the contextual and geometric characteristics of 3D printing models differ significantly from those used for computer graphics applications, not to mention standard models (e.g., Stanford bunny, Armadillo, Fertility). We present a new dataset of 10,000 models collected from an online 3D printing model-sharing database. Via analysis of both geometric (e.g., triangle aspect ratios, manifoldness) and contextual (e.g., licenses, tags, classes) characteristics, we demonstrate that this dataset represents a more concise summary of real-world models used for 3D printing compared to existing datasets. To facilitate future research endeavors, we also present an online query interface to select subsets of the dataset according to project-specific characteristics. The complete dataset and per-model statistical data are freely available to the public.




Summary

The paper presents Thingi10K, a dataset of 10,000 real-world 3D printing models capturing complex issues like self-intersections and non-manifold structures.
It details geometric and contextual analyses and compares Thingi10K with existing datasets such as the sanitized MPZ14 and the visualization-oriented ShapeNetCore.
The dataset supports robustness testing of 3D printing algorithms and could support machine learning applications by representing genuine fabrication challenges.

Thingi10K: A Dataset of 10,000 3D-Printing Models

The paper "Thingi10K: A Dataset of 10,000 3D-Printing Models" by Qingnan Zhou and Alec Jacobson presents a dataset designed for empirically validating algorithms and implementations related to 3D printing. Unlike the standard models often used in computer graphics, the Thingi10K models are diverse and retain the geometric complexities and defects prevalent in 3D printing.

Dataset Composition and Characteristics

Thingi10K comprises 10,000 models sourced from Thingiverse, a prominent online repository for 3D-printable items. What distinguishes Thingi10K is its emphasis on real-world geometry, including defects such as self-intersections, non-manifoldness, and degenerate triangles. The models vary in item class, representation type, and the kinds of defects they contain. Because these defects are left in place rather than cleaned up, the dataset allows more realistic evaluations of 3D printing algorithms: it reflects the inputs such algorithms actually receive.
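As an illustration of one defect named above, the following is a minimal sketch of a degenerate-triangle check on an indexed triangle mesh. It is not the authors' tooling: the function names, the list-of-tuples mesh layout, and the area threshold `eps` are assumptions chosen for this example.

```python
import math

def triangle_area(a, b, c):
    """Area of the triangle with 3D corner points a, b, c."""
    ux, uy, uz = (b[i] - a[i] for i in range(3))
    vx, vy, vz = (c[i] - a[i] for i in range(3))
    # Half the length of the cross product of the two edge vectors.
    cx = uy * vz - uz * vy
    cy = uz * vx - ux * vz
    cz = ux * vy - uy * vx
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)

def degenerate_faces(vertices, faces, eps=1e-12):
    """Indices of faces that repeat a vertex index or have area <= eps.

    `eps` is an illustrative threshold (an assumption), not a value
    taken from the paper.
    """
    bad = []
    for idx, (i, j, k) in enumerate(faces):
        if len({i, j, k}) < 3:
            bad.append(idx)  # topologically degenerate: repeated index
        elif triangle_area(vertices[i], vertices[j], vertices[k]) <= eps:
            bad.append(idx)  # geometrically degenerate: collinear corners
    return bad

# Made-up example mesh: one valid face, one collinear face, one repeated index.
vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (2, 0, 0)]
faces = [(0, 1, 2), (0, 1, 3), (0, 0, 1)]
flagged = degenerate_faces(vertices, faces)
```

A fixed absolute threshold is the simplest choice here; a scale-aware test (for example, relative to the bounding box) may be more appropriate for models with very different sizes.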

Benchmarking Against Existing Datasets

The authors compare Thingi10K with existing datasets such as MPZ14 and ShapeNetCore. MPZ14 is sanitized, which limits its real-world applicability, while ShapeNetCore contains models made primarily for visualization rather than prepared for fabrication. Thingi10K fills this gap with a broad, realistic sample of models intended for printing, accompanied by geometric statistics and contextual data such as licenses and classes.

Geometric and Contextual Analysis

The paper gives a detailed assessment of the geometric properties of the Thingi10K models, covering vertex counts, component analysis, genus, mesh quality, and manifoldness. This breadth of analysis supports the claim that the dataset reflects the characteristic challenges encountered in 3D printing. The analysis also suggests that high-level measures such as vertex count can mask the lower-level defects that matter for geometry processing in printing pipelines.
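To make the properties listed above concrete, here is a minimal sketch of how component count, closed edge-manifoldness, and genus could be computed from a face list alone. It is an illustration, not the paper's implementation: the function names are hypothetical, and the genus formula V - E + F = 2 - 2g is applied only to a single closed component that is assumed to be orientable and vertex-manifold, which this sketch does not verify.

```python
from collections import defaultdict

def edge_face_counts(faces):
    """Map each undirected edge to the number of faces that use it."""
    counts = defaultdict(int)
    for i, j, k in faces:
        for a, b in ((i, j), (j, k), (k, i)):
            counts[(min(a, b), max(a, b))] += 1
    return counts

def is_closed_edge_manifold(faces):
    """True if every edge is shared by exactly two faces."""
    return all(n == 2 for n in edge_face_counts(faces).values())

def num_components(faces):
    """Number of connected components, via union-find over face vertices."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j, k in faces:
        root = find(i)
        parent[find(j)] = root
        parent[find(k)] = root
    return len({find(v) for v in list(parent)})

def genus(faces):
    """Genus from the Euler characteristic, or None if it does not apply.

    Only defined here for one closed, edge-manifold component; orientability
    and vertex-manifoldness are assumed, not checked.
    """
    if not is_closed_edge_manifold(faces) or num_components(faces) != 1:
        return None
    v = len({idx for face in faces for idx in face})
    e = len(edge_face_counts(faces))
    chi = v - e + len(faces)
    return (2 - chi) // 2

# A tetrahedron: closed, one component, sphere topology.
tetrahedron = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (0, 2, 3)]
tetra_genus = genus(tetrahedron)
```

Returning `None` for open or multi-component meshes is a deliberate choice in this sketch: for the defective inputs the dataset is meant to exercise, a single genus value is often not well defined.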

Furthermore, Thingi10K includes contextual metadata, which makes it easier to use in machine learning and data mining projects. Examples include categories, subcategories, and user-generated tags. These semantic annotations extend the dataset's utility across a range of applications.
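Selecting a project-specific subset from such metadata can be sketched as a simple filter over per-model records. Everything in this example is hypothetical: the record layout, the field names (`id`, `tags`, `num_faces`, `is_manifold`), and the sample values are invented for illustration and are not the dataset's actual schema or query interface.

```python
def select_models(records, required_tags=(), max_faces=None, manifold=None):
    """Return ids of records that match all given criteria.

    `records` is a list of dicts with hypothetical keys:
    "id", "tags", "num_faces", "is_manifold".
    """
    wanted = {t.lower() for t in required_tags}
    selected = []
    for rec in records:
        tags = {t.lower() for t in rec.get("tags", [])}
        if not wanted <= tags:
            continue  # missing at least one required tag
        if max_faces is not None and rec["num_faces"] > max_faces:
            continue  # too complex for this experiment
        if manifold is not None and rec["is_manifold"] != manifold:
            continue  # wrong manifoldness for this experiment
        selected.append(rec["id"])
    return selected

# Invented sample records, for illustration only.
records = [
    {"id": 1, "tags": ["Toy", "robot"], "num_faces": 1200, "is_manifold": True},
    {"id": 2, "tags": ["toy"], "num_faces": 90000, "is_manifold": False},
    {"id": 3, "tags": ["jewelry"], "num_faces": 500, "is_manifold": True},
]
small_toys = select_models(records, required_tags=["toy"], max_faces=10000)
broken = select_models(records, manifold=False)
```

Passing `manifold=False` selects only defective models, which matches the use case of stress-testing an algorithm's robustness rather than its behavior on clean input.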

Implications and Future Directions

The Thingi10K dataset is well suited to testing geometry processing algorithms for structural analysis, shape optimization, and solid geometry operations tailored for 3D printing. Because of its complexity and variety, it offers a real-world benchmark for robustness against the geometric problems common in practice.

The paper hints at future improvements, including expanding the dataset to keep pace with the evolving 3D printing community and to improve its usability and reliability. Thingi10K could also serve as ground truth data for training machine learning models that require genuine digital representations of 3D printed objects.

Overall, Thingi10K is a significant resource for the 3D printing community and supports an empirically driven approach to developing processing algorithms. Grounding evaluation in real-world data is a meaningful step forward for research methodology in this field.
