ShapeNet: An Information-Rich 3D Model Repository (1512.03012v1)

Published 9 Dec 2015 in cs.GR, cs.AI, cs.CG, cs.CV, and cs.RO

Abstract: We present ShapeNet: a richly-annotated, large-scale repository of shapes represented by 3D CAD models of objects. ShapeNet contains 3D models from a multitude of semantic categories and organizes them under the WordNet taxonomy. It is a collection of datasets providing many semantic annotations for each 3D model such as consistent rigid alignments, parts and bilateral symmetry planes, physical sizes, keywords, as well as other planned annotations. Annotations are made available through a public web-based interface to enable data visualization of object attributes, promote data-driven geometric analysis, and provide a large-scale quantitative benchmark for research in computer graphics and vision. At the time of this technical report, ShapeNet has indexed more than 3,000,000 models, 220,000 models out of which are classified into 3,135 categories (WordNet synsets). In this report we describe the ShapeNet effort as a whole, provide details for all currently available datasets, and summarize future plans.

Authors (13)

Angel X. Chang
Thomas Funkhouser
Leonidas Guibas
Pat Hanrahan
Qixing Huang
Zimo Li
Silvio Savarese
Manolis Savva
Shuran Song
Hao Su
Jianxiong Xiao
Li Yi
Fisher Yu

Citations (5,089)

Summary

The paper introduces a large-scale repository that has indexed over 3 million 3D CAD models, about 220,000 of which are classified into 3,135 WordNet synsets, with rich semantic annotations intended to support several research fields.
The paper describes multiple layers of annotation (language-related, geometric, functional, and physical) designed for accurate and scalable data curation.
The paper describes a hierarchical rigid alignment procedure that combines algorithmic prediction with human verification so that models have consistent orientations.

Overview

"ShapeNet: An Information-Rich 3D Model Repository" presents a large-scale repository of 3D CAD models with detailed annotations. ShapeNet organizes its models under the WordNet taxonomy and provides semantic annotations such as consistent rigid alignments, symmetry planes, and part decompositions. At the time of the report, it had indexed more than 3,000,000 models, of which 220,000 were classified into 3,135 categories (WordNet synsets). It is intended to support research in computer graphics, vision, and robotics by providing structured 3D model data together with these annotations.

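To make the taxonomy-based organization concrete, here is a minimal Python sketch of indexing models under a WordNet-style hypernym hierarchy and counting the models that fall under each category, including its more specific subcategories. The toy taxonomy, the model IDs, and the function names are illustrative assumptions; they are not ShapeNet's actual data or API.

```python
from collections import defaultdict

# Hypothetical toy taxonomy: child synset -> parent synset (hypernym).
HYPERNYM = {
    "armchair": "chair",
    "swivel_chair": "chair",
    "chair": "furniture",
    "table": "furniture",
    "furniture": None,  # root of this toy hierarchy
}

# Hypothetical model IDs, each labeled with its most specific synset.
MODEL_SYNSET = {
    "m001": "armchair",
    "m002": "armchair",
    "m003": "swivel_chair",
    "m004": "chair",
    "m005": "table",
}


def ancestors(synset):
    """Yield the synset itself and every hypernym above it, up to the root."""
    while synset is not None:
        yield synset
        synset = HYPERNYM[synset]


def count_models_per_synset(model_synset):
    """Count models under each synset, including models in its hyponyms."""
    counts = defaultdict(int)
    for synset in model_synset.values():
        for node in ancestors(synset):
            counts[node] += 1
    return dict(counts)


counts = count_models_per_synset(MODEL_SYNSET)
print(counts["chair"])      # models labeled chair, armchair, or swivel_chair
print(counts["furniture"])  # every model in the toy index
```

Labeling each model with its most specific synset and deriving the counts for broader categories from the hierarchy means a model never has to be tagged at more than one level.
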
Motivations and Goals

The increasing availability of 3D data, driven by RGB-D sensors and online 3D model repositories, raises challenges such as segmenting 3D shapes and establishing correspondences between them. Existing 3D datasets lack the scale and semantic richness needed for robust data-driven methods in computer graphics and vision. Inspired by large datasets such as ImageNet, ShapeNet aims to:

Collect and centralize 3D model datasets.
Support data-driven methodologies.
Evaluate and compare geometric algorithms.
Serve as a comprehensive knowledge base of real-world objects.

Annotation Types and Methodologies

ShapeNet differentiates itself by providing multiple layers of annotations:

Language-related Annotations: Models are organized under WordNet synsets, integrating with resources like ImageNet and ConceptNet. Future plans include object descriptions and relation annotations.
Geometric Annotations: Essential attributes include consistent rigid alignments, part decompositions, symmetry planes, and physical object sizes.
Functional Annotations: These describe object parts with functional roles and object affordances, useful for human-robot interaction and robotics.
Physical Annotations: These include surface material properties and weight estimates, which are important for simulation and physical reasoning.
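One way to picture the four annotation layers listed above is as a single per-model record in which every layer is optional, since not all models carry all annotations. The Python sketch below is only an illustration: the class name, field names, and units are assumptions and do not reflect ShapeNet's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class ModelAnnotations:
    """Hypothetical per-model record; None or an empty list means 'not annotated'."""

    model_id: str
    # Language-related layer
    synset: Optional[str] = None
    keywords: List[str] = field(default_factory=list)
    # Geometric layer
    up: Optional[Vec3] = None
    front: Optional[Vec3] = None
    symmetry_plane_normal: Optional[Vec3] = None
    size_m: Optional[Vec3] = None  # assumed unit: meters
    part_labels: List[str] = field(default_factory=list)
    # Functional layer
    affordances: List[str] = field(default_factory=list)
    # Physical layer
    material: Optional[str] = None
    weight_kg: Optional[float] = None  # assumed unit: kilograms

    def annotated_layers(self) -> List[str]:
        """Return the names of the layers that have at least one value set."""
        geometric = (self.up, self.front, self.symmetry_plane_normal, self.size_m)
        present = {
            "language": self.synset is not None or bool(self.keywords),
            "geometric": any(v is not None for v in geometric) or bool(self.part_labels),
            "functional": bool(self.affordances),
            "physical": self.material is not None or self.weight_kg is not None,
        }
        return [name for name, is_set in present.items() if is_set]


chair = ModelAnnotations(
    model_id="m001",
    synset="chair",
    up=(0.0, 0.0, 1.0),
    front=(0.0, -1.0, 0.0),
    weight_kg=7.5,
)
print(chair.annotated_layers())
```

Keeping unset fields as `None` or empty lists lets a consumer distinguish "not yet annotated" from an actual value, which matters when different subsets of a repository carry different annotation layers.
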

Annotations are acquired through a hybrid approach in which algorithmic predictions are verified by human annotators. This is intended to keep accuracy high while still scaling to large datasets.
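The predict-then-verify pattern can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the `Proposal` type, the `verify` function, and the simulated annotator are hypothetical names, and a real verification interface would involve human annotators rather than a lookup table.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Proposal:
    model_id: str
    label: str  # annotation value proposed by an algorithm


def verify(
    proposals: List[Proposal],
    review: Callable[[Proposal], Optional[str]],
) -> Tuple[Dict[str, str], int]:
    """Pass every proposal through a reviewer.

    `review` returns None to accept the proposal, or a corrected label.
    Returns the verified labels and the number of corrections made.
    """
    verified: Dict[str, str] = {}
    corrected = 0
    for proposal in proposals:
        answer = review(proposal)
        if answer is None:
            verified[proposal.model_id] = proposal.label
        else:
            verified[proposal.model_id] = answer
            corrected += 1
    return verified, corrected


# Stand-in for a human annotator: compares against hypothetical ground truth.
TRUTH = {"m001": "up=+Z", "m002": "up=+Z", "m003": "up=+Z"}


def simulated_annotator(proposal: Proposal) -> Optional[str]:
    expected = TRUTH[proposal.model_id]
    return None if proposal.label == expected else expected


proposals = [
    Proposal("m001", "up=+Z"),
    Proposal("m002", "up=+Y"),  # a wrong prediction the reviewer should fix
    Proposal("m003", "up=+Z"),
]
verified, corrected = verify(proposals, simulated_annotator)
print(verified, corrected)
```

The reviewer only has to confirm or correct a proposed value instead of producing each annotation from scratch, which is where the scalability of a hybrid approach comes from.
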

Data Collection and Current Statistics

Models are sourced from public online repositories such as Trimble 3D Warehouse and Yobi3D, which yields a wide diversity of 3D models. Models are tessellated, indexed, and annotated with geometric and semantic information. At the time of the report, ShapeNet had indexed more than 3,000,000 models, of which 220,000 were classified into 3,135 WordNet synsets.

Two subsets are described in detail: ShapeNetCore, with about 51,300 models covering 55 common object categories, and ShapeNetSem, a smaller subset of about 12,000 models spread over 270 categories and carrying richer annotations.

Hierarchical Rigid Alignment

The alignment process is hierarchical: models are first aligned within leaf categories, and the process then moves up the taxonomy. An algorithmic approach based on Markov Random Fields (MRF) produces an initial alignment, which human annotators then verify. The goal is for models to be consistently oriented within and across categories, which matters for tasks such as shape recognition and classification.
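The bottom-up structure of this process can be illustrated with a small Python sketch that composes per-level corrections along the path from a model to the root of the taxonomy. It deliberately leaves out the MRF-based prediction and the human verification step: the rotation values below are hypothetical inputs standing in for their results, and rotations are restricted to quarter turns about the vertical axis so that composing them is simple addition modulo 4.

```python
# Hypothetical toy taxonomy: category -> parent category (None at the root).
PARENT = {"armchair": "chair", "swivel_chair": "chair", "chair": None}

# Step 1 (within leaf categories): the leaf category of each model and the
# quarter turns that bring the model into that category's reference frame.
MODEL_TO_CATEGORY = {
    "m001": ("armchair", 1),
    "m002": ("armchair", 0),
    "m003": ("swivel_chair", 3),
}

# Step 2 (moving up): quarter turns that bring each category's reference
# frame into the frame of its parent category.
CATEGORY_TO_PARENT = {"armchair": 2, "swivel_chair": 1, "chair": 0}


def canonical_rotation(model_id):
    """Compose the corrections from the model up to the root, modulo 4."""
    category, turns = MODEL_TO_CATEGORY[model_id]
    while category is not None:
        turns += CATEGORY_TO_PARENT[category]
        category = PARENT[category]
    return turns % 4


for model_id in MODEL_TO_CATEGORY:
    print(model_id, canonical_rotation(model_id) * 90, "degrees")
```

Because each level stores only a correction relative to its parent, re-aligning one category against its siblings updates every model beneath it without touching the per-model values. General 3D alignment would use full rotation matrices, for which composition order matters; the single-axis case is used here only to keep the sketch short.
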

Implications and Future Directions

ShapeNet has implications for both research and practical applications in AI:

Research Facilitation: The repository acts as a backbone for data-driven research, fostering advancements in neural networks for 3D data.
Benchmarking: ShapeNet facilitates the development of standardized benchmarks for evaluating algorithms across various geometric tasks.
Cross-Disciplinary Utility: The repository's extensibility allows it to bridge gaps between multiple research fields, from visual object recognition to robotic manipulation.

Conclusion

ShapeNet is an ongoing effort to build a foundational resource for the computer graphics and vision communities. Its structure and annotations support current research and lay the groundwork for future work in 3D model analysis and understanding. Planned expansion includes additional annotations, deeper correspondences, and integration with real-world RGB-D data.
