A Comprehensive Survey on 3D Content Generation (2402.01166v2)

Published 2 Feb 2024 in cs.CV and cs.AI

Abstract: Recent years have witnessed remarkable advances in artificial intelligence generated content (AIGC) across diverse input modalities, e.g., text, image, video, audio, and 3D. 3D is the visual modality closest to the real-world 3D environment and carries enormous knowledge. 3D content generation has both academic and practical value while also presenting formidable technical challenges. This review aims to consolidate developments within the burgeoning domain of 3D content generation. Specifically, a new taxonomy is proposed that categorizes existing approaches into three types: 3D native generative methods, 2D prior-based 3D generative methods, and hybrid 3D generative methods. The survey covers approximately 60 papers spanning the major techniques. In addition, we discuss the limitations of current 3D content generation techniques and point out open challenges as well as promising directions for future work. Accompanying this survey, we have established a project website where resources on 3D content generation research are provided. The project page is available at https://github.com/hitcslj/Awesome-AIGC-3D.

References (66)
  1. Gpt-4 technical report. arXiv preprint arXiv:2303.08774, 2023.
  2. 4d-fy: Text-to-4d generation using hybrid score distillation sampling. arXiv preprint arXiv:2311.17984, 2023.
  3. Gaudi: A neural architect for immersive 3d scene generation. NeurIPS, 2022.
  4. Improving image generation with better captions. Computer Science, 2023.
  5. Face recognition based on fitting a 3d morphable model. TPAMI, 2003.
  6. Texfusion: Synthesizing 3d textures with text-guided image diffusion models. In ICCV, 2023.
  7. Text2shape: Generating shapes from natural language by learning joint embeddings. In ACCV, 2019.
  8. Towards efficient and photorealistic 3d human reconstruction: a brief survey. Visual Informatics, 2021.
  9. Sofgan: A portrait image generator with dynamic styling. TOG, 2022.
  10. gDNA: Towards generative detailed neural avatars. In CVPR, 2022.
  11. Scenetex: High-quality texture synthesis for indoor scenes via diffusion priors. arXiv preprint arXiv:2311.17261, 2023.
  12. Fantasia3d: Disentangling geometry and appearance for high-quality text-to-3d content creation. In ICCV, 2023.
  13. Scenedreamer: Unbounded 3d scene generation from 2d image collections. TPAMI, 2023.
  14. Text-to-3d using gaussian splatting. arXiv preprint arXiv:2309.16585, 2023.
  15. Sdfusion: Multimodal 3d shape completion, reconstruction, and generation. In CVPR, 2023.
  16. Luciddreamer: Domain-free generation of 3d gaussian splatting scenes. arXiv preprint arXiv:2311.13384, 2023.
  17. Smplicit: Topology-aware generative model for clothed people. In CVPR, 2021.
  18. Shapecrafter: A recursive text-conditioned 3d shape generation model. NeurIPS, 2022.
  19. Headsculpt: Crafting 3d head avatars with text. In NeurIPS, 2023.
  20. Text2room: Extracting textured 3d meshes from 2d text-to-image models. arXiv preprint arXiv:2303.11989, 2023.
  21. Avatarclip: Zero-shot text-driven generation and animation of 3d avatars. arXiv preprint arXiv:2205.08535, 2022.
  22. Headnerf: A real-time nerf-based parametric head model. In CVPR, 2022.
  23. Lrm: Large reconstruction model for single image to 3d. ICLR, 2024.
  24. Textfield3d: Towards enhancing open-vocabulary 3d generation with noisy text fields. arXiv preprint arXiv:2309.17175, 2023.
  25. Dreamcontrol: Control-based text-to-3d generation with 3d self-prior. arXiv preprint arXiv:2312.06439, 2023.
  26. Humannorm: Learning normal diffusion model for high-quality and realistic 3d human generation. arXiv preprint arXiv:2310.01406, 2023.
  27. Dreamwaltz: Make a scene with complex 3d animatable avatars. arXiv preprint arXiv:2305.12529, 2023.
  28. Shap-e: Generating conditional 3d implicit functions. arXiv preprint arXiv:2305.02463, 2023.
  29. 3d gaussian splatting for real-time radiance field rendering. TOG, 2023.
  30. Neuralfield-ldm: Scene generation with hierarchical latent diffusion models. In CVPR, 2023.
  31. Dreamhuman: Animatable 3d avatars from text. arXiv preprint arXiv:2306.09329, 2023.
  32. Generative ai meets 3d: A survey on text-to-3d in aigc era. arXiv preprint arXiv:2305.06131, 2023.
  33. Instant3d: Fast text-to-3d with sparse-view generation and large reconstruction model. arXiv preprint arXiv:2311.06214, 2023.
  34. Magic3d: High-resolution text-to-3d content creation. In CVPR, 2023.
  35. Deep learning for procedural content generation. Neural Computing and Applications, 2021.
  36. One-2-3-45++: Fast single image to 3d objects with consistent multi-view generation and 3d diffusion. arXiv preprint arXiv:2311.07885, 2023.
  37. One-2-3-45: Any single image to 3d mesh in 45 seconds without per-shape optimization. In NeurIPS, 2023.
  38. Zero-1-to-3: Zero-shot one image to 3d object. In ICCV, 2023.
  39. 3dall-e: Integrating text-to-image ai in 3d design workflows. In ACM DIS, 2023.
  40. Humangaussian: Text-driven 3d human generation with gaussian splatting. arXiv preprint arXiv:2311.17061, 2023.
  41. Syncdreamer: Generating multiview-consistent images from a single-view image. ICLR, 2024.
  42. Wonder3d: Single image to 3d using cross-domain diffusion. arXiv preprint arXiv:2310.15008, 2023.
  43. SMPL: A skinned multi-person linear model. In Seminal Graphics Papers: Pushing the Boundaries, Volume 2. 2023.
  44. Point-e: A system for generating 3d point clouds from complex prompts. arXiv preprint arXiv:2212.08751, 2022.
  45. Dreamfusion: Text-to-3d using 2d diffusion. In ICLR, 2023.
  46. Dreamgaussian4d: Generative 4d gaussian splatting. arXiv preprint arXiv:2312.17142, 2023.
  47. Texture: Text-guided texturing of 3d shapes. In SIGGRAPH, 2023.
  48. PIFu: Pixel-aligned implicit function for high-resolution clothed human digitization. In ICCV, 2019.
  49. SCULPT: Shape-conditioned unpaired learning of pose-dependent clothed and textured human meshes. arXiv preprint arXiv:2308.10638, 2023.
  50. Controlroom3d: Room generation using semantic proxy rooms. arXiv preprint arXiv:2312.05208, 2023.
  51. Graf: Generative radiance fields for 3d-aware image synthesis. NeurIPS, 2020.
  52. Deep generative models on 3d representations: A survey. arXiv preprint arXiv:2210.15663, 2022.
  53. Mvdream: Multi-view diffusion for 3d generation. ICLR, 2024.
  54. Text-to-4d dynamic scene generation. arXiv preprint arXiv:2301.11280, 2023.
  55. Dreamcraft3d: Hierarchical 3d generation with bootstrapped diffusion prior. arXiv preprint arXiv:2310.16818, 2023.
  56. Mvdiffusion: Enabling holistic multi-view image generation with correspondence-aware diffusion, 2023.
  57. Dreamgaussian: Generative gaussian splatting for efficient 3d content creation. ICLR, 2024.
  58. Rodin: A generative model for sculpting 3d digital avatars using diffusion. In CVPR, 2023.
  59. Prolificdreamer: High-fidelity and diverse text-to-3d generation with variational score distillation. In NeurIPS, 2023.
  60. Gpt-4v (ision) is a human-aligned evaluator for text-to-3d generation. arXiv preprint arXiv:2401.04092, 2024.
  61. Get3DHuman: Lifting StyleGAN-Human into a 3D generative model using pixel-aligned reconstruction priors. In ICCV, 2023.
  62. Dmv3d: Denoising multi-view diffusion using 3d large reconstruction model. ICLR, 2024.
  63. 4dgen: Grounded 4d content generation with spatial-temporal consistency. arXiv preprint arXiv:2312.17225, 2023.
  64. Dreamface: Progressive generation of animatable 3d faces under text guidance. arXiv preprint arXiv:2304.03117, 2023.
  65. Scenewiz3d: Towards text-guided 3d scene composition. arXiv preprint arXiv:2312.08885, 2023.
  66. Animate124: Animating one image to 4d dynamic scene. arXiv preprint arXiv:2311.14603, 2023.
Authors (11)
  1. Jian Liu
  2. Xiaoshui Huang
  3. Tianyu Huang
  4. Lu Chen
  5. Yuenan Hou
  6. Shixiang Tang
  7. Ziwei Liu
  8. Wanli Ouyang
  9. Wangmeng Zuo
  10. Junjun Jiang
  11. Xianming Liu
Citations (15)

Summary

A Comprehensive Survey on 3D Content Generation

The paper "A Comprehensive Survey on 3D Content Generation" conducts an in-depth examination of the current landscape of three-dimensional (3D) content generation. It is motivated by the burgeoning interest in 3D Artificial Intelligence Generated Content (AIGC), which presents both significant academic interest and practical applications across various domains such as gaming, entertainment, construction, and industrial design. In this survey, the authors propose a new taxonomy for classifying 3D content generation methodologies into three categories: 3D native generative methods, 2D prior-based 3D generative methods, and hybrid 3D generative methods.

Taxonomy and Techniques

3D Native Generative Methods generate 3D content directly from 3D data, covering objects, scenes, and human avatars and employing representations such as point clouds, voxels, meshes, and neural fields. Their principal limitation stems from the scarcity of large, comprehensive 3D datasets, which restricts the vocabulary and richness of the content these methods can generate.
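To make the representations above concrete, the following sketch (an illustrative example, not code from the survey) builds a sphere as an implicit signed distance field, voxelizes it into an explicit occupancy grid, and extracts a point cloud view of the same shape:

```python
import numpy as np

# Three representations of one shape, as mentioned above:
# an implicit signed distance field (SDF), an explicit voxel
# occupancy grid, and a point cloud. 3D native generative
# methods learn to produce such fields or grids directly
# from 3D training data.

N = 16
coords = np.linspace(-1.0, 1.0, N)
x, y, z = np.meshgrid(coords, coords, coords, indexing="ij")

# SDF of a sphere with radius 0.5: negative inside, positive outside.
sdf = np.sqrt(x**2 + y**2 + z**2) - 0.5

# Voxelize: a cell is occupied when its center lies inside the surface.
occupancy = sdf < 0

# Point cloud view: centers of the occupied voxels.
points = np.stack([x[occupancy], y[occupancy], z[occupancy]], axis=1)
```

Each form trades off differently: the SDF is resolution-free but needs a decoder to query, while the voxel grid and point cloud are direct but memory-bound, which is why generative methods in this family differ mainly in the representation they target.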

2D Prior-based 3D Generative Methods leverage the wealth of existing 2D image diffusion models for 3D content synthesis. Techniques such as DreamFusion distill a pretrained 2D diffusion model into a 3D representation via score distillation. By exploiting multi-view and image-based priors, these methods circumvent the 3D data limitation, but they face challenges such as multi-view consistency, slow per-scene optimization, and preserving geometric detail.
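The core mechanism is score distillation sampling (SDS), popularized by DreamFusion: render the 3D parameters, perturb the rendering with noise, and push the frozen 2D model's noise residual back through the renderer. The toy sketch below is an assumed, heavily simplified illustration; `render` and `fake_denoiser` are hypothetical stand-ins for a differentiable renderer and a pretrained text-conditioned diffusion model:

```python
import numpy as np

rng = np.random.default_rng(0)

def render(theta):
    # Stand-in differentiable renderer: identity mapping for simplicity.
    return theta

def fake_denoiser(x_noisy, t):
    # Stand-in for eps_phi(x_t; y, t): behaves as if the 2D prior's
    # preferred clean image were all-ones (a hypothetical "target").
    target = np.ones_like(x_noisy)
    return (x_noisy - np.sqrt(1.0 - t) * target) / np.sqrt(t)

def sds_grad(theta, t=0.5):
    x = render(theta)
    eps = rng.standard_normal(x.shape)                 # sampled noise
    x_noisy = np.sqrt(1.0 - t) * x + np.sqrt(t) * eps  # diffuse the render
    eps_hat = fake_denoiser(x_noisy, t)                # frozen 2D prior
    w = 1.0                                            # weighting w(t)
    # SDS gradient: w(t) * (eps_hat - eps) * dx/dtheta (identity here).
    return w * (eps_hat - eps)

theta = np.zeros(4)          # "3D" parameters being optimized
for _ in range(200):
    theta -= 0.05 * sds_grad(theta)
# theta is pulled toward the prior's preferred image (all-ones here).
```

The key design choice is that the diffusion model is never fine-tuned; only the 3D parameters receive gradients, which is what lets a single 2D prior supervise arbitrarily many 3D scenes.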

Hybrid 3D Generative Methods aim to combine the strengths of both native and prior-based approaches, integrating 3D data with powerful 2D priors. Methods such as Zero-1-to-3 and its derivatives employ multi-view fine-tuning and large-scale reconstruction models to produce coherent 3D assets efficiently. This category represents a convergence of methodologies, aiming to combine the geometric accuracy afforded by 3D data with the creative breadth of 2D priors.

Key Findings and Numerical Results

The survey covers approximately 60 influential papers, highlighting significant methodologies and developments in the field. Notably, methods built on 3D Gaussian Splatting have shown substantial speedups (up to 10x faster) over NeRF-based counterparts, a notable advance toward rapid 3D generative tasks.
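The speed advantage comes from rasterizing explicit Gaussian primitives in closed form rather than evaluating a neural field along every camera ray. A toy 2D analogue (an assumed sketch: isotropic Gaussians and additive compositing, whereas real 3DGS uses anisotropic 3D Gaussians, projection, and depth-ordered alpha blending) shows the idea:

```python
import numpy as np

# Toy 2D "Gaussian splatting": render a few isotropic Gaussian
# primitives onto a pixel grid by evaluating each in closed form.
# No per-sample network queries are needed, which is the source
# of the speedups reported over NeRF-style volume rendering.

H = W = 32
ys, xs = np.mgrid[0:H, 0:W].astype(float)

# Each splat: (center_x, center_y, std, intensity) -- toy values.
splats = [(8.0, 8.0, 2.0, 1.0), (20.0, 24.0, 3.0, 0.5)]

image = np.zeros((H, W))
for cx, cy, sigma, a in splats:
    d2 = (xs - cx) ** 2 + (ys - cy) ** 2
    # Additive compositing (simplification of alpha blending).
    image += a * np.exp(-d2 / (2.0 * sigma**2))

# The brightest pixel lands at the center of the strongest splat.
peak = np.unravel_index(np.argmax(image), image.shape)  # → (8, 8)
```

Because each primitive touches only a bounded pixel footprint and all evaluations are simple closed-form expressions, the whole pass parallelizes trivially on a GPU.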

Challenges and Future Directions

The paper provides a critical analysis of unresolved challenges such as maintaining high-quality generation, ensuring multi-view consistency, and improving speed without compromising fidelity. From a data perspective, there is an imperative need for larger, more diverse 3D datasets. Model-wise, the advancement of foundational 3D models and architectures tailored to large datasets is a potential avenue for development.

The paper also highlights the importance of establishing robust benchmarks to evaluate 3D content quality, suggesting that automated evaluation metrics should evolve to comprehensively address both geometric and textural fidelity.

Implications and Outlook

The insights offered by this survey lay essential groundwork for further research and development in the field of 3D content generation. As 3D generation techniques evolve, they promise to revolutionize applications across industries by providing innovative and efficient design methods. Furthermore, the integration of future LLMs and multimodal intelligence systems poses an intriguing direction for developing advanced 3D generative frameworks capable of seamless operation within the digital content creation domain.

In conclusion, this paper serves as a pivotal resource charting the trajectory of 3D generative content technologies, providing both an overview of past work and a roadmap for future exploration and application in artificial intelligence and beyond.
