A Comprehensive Survey on 3D Content Generation (2402.01166v2)
Abstract: Recent years have witnessed remarkable advances in artificial intelligence generated content (AIGC) across diverse input modalities, e.g., text, image, video, audio, and 3D. Among these, 3D is the visual modality closest to the real-world 3D environment and carries enormous knowledge. 3D content generation is of both academic and practical value, while also posing formidable technical challenges. This review consolidates developments within the burgeoning domain of 3D content generation. Specifically, a new taxonomy is proposed that categorizes existing approaches into three types: 3D native generative methods, 2D prior-based 3D generative methods, and hybrid 3D generative methods. The survey covers approximately 60 papers spanning the major techniques. In addition, we discuss the limitations of current 3D content generation techniques and point out open challenges as well as promising directions for future work. Accompanying this survey, we have established a project website that provides resources on 3D content generation research. The project page is available at https://github.com/hitcslj/Awesome-AIGC-3D.
- Gpt-4 technical report. arXiv preprint arXiv:2303.08774, 2023.
- 4d-fy: Text-to-4d generation using hybrid score distillation sampling. arXiv preprint arXiv:2311.17984, 2023.
- Gaudi: A neural architect for immersive 3d scene generation. NeurIPS, 2022.
- Improving image generation with better captions. Computer Science, 2023.
- Face recognition based on fitting a 3d morphable model. TPAMI, 2003.
- Texfusion: Synthesizing 3d textures with text-guided image diffusion models. In ICCV, 2023.
- Text2shape: Generating shapes from natural language by learning joint embeddings. In ACCV, 2019.
- Towards efficient and photorealistic 3d human reconstruction: a brief survey. Visual Informatics, 2021.
- Sofgan: A portrait image generator with dynamic styling. TOG, 2022.
- gDNA: Towards generative detailed neural avatars. In CVPR, 2022.
- Scenetex: High-quality texture synthesis for indoor scenes via diffusion priors. arXiv preprint arXiv:2311.17261, 2023.
- Fantasia3d: Disentangling geometry and appearance for high-quality text-to-3d content creation. In ICCV, 2023.
- Scenedreamer: Unbounded 3d scene generation from 2d image collections. TPAMI, 2023.
- Text-to-3d using gaussian splatting. arXiv preprint arXiv:2309.16585, 2023.
- Sdfusion: Multimodal 3d shape completion, reconstruction, and generation. In CVPR, 2023.
- Luciddreamer: Domain-free generation of 3d gaussian splatting scenes. arXiv preprint arXiv:2311.13384, 2023.
- Smplicit: Topology-aware generative model for clothed people. In CVPR, 2021.
- Shapecrafter: A recursive text-conditioned 3d shape generation model. NeurIPS, 2022.
- Headsculpt: Crafting 3d head avatars with text. In NeurIPS, 2023.
- Text2room: Extracting textured 3d meshes from 2d text-to-image models. arXiv preprint arXiv:2303.11989, 2023.
- Avatarclip: Zero-shot text-driven generation and animation of 3d avatars. arXiv preprint arXiv:2205.08535, 2022.
- Headnerf: A real-time nerf-based parametric head model. In CVPR, 2022.
- Lrm: Large reconstruction model for single image to 3d. ICLR, 2024.
- Textfield3d: Towards enhancing open-vocabulary 3d generation with noisy text fields. arXiv preprint arXiv:2309.17175, 2023.
- Dreamcontrol: Control-based text-to-3d generation with 3d self-prior. arXiv preprint arXiv:2312.06439, 2023.
- Humannorm: Learning normal diffusion model for high-quality and realistic 3d human generation. arXiv preprint arXiv:2310.01406, 2023.
- Dreamwaltz: Make a scene with complex 3d animatable avatars. arXiv preprint arXiv:2305.12529, 2023.
- Shap-e: Generating conditional 3d implicit functions. arXiv preprint arXiv:2305.02463, 2023.
- 3d gaussian splatting for real-time radiance field rendering. TOG, 2023.
- Neuralfield-ldm: Scene generation with hierarchical latent diffusion models. In CVPR, 2023.
- Dreamhuman: Animatable 3d avatars from text. arXiv preprint arXiv:2306.09329, 2023.
- Generative ai meets 3d: A survey on text-to-3d in aigc era. arXiv preprint arXiv:2305.06131, 2023.
- Instant3d: Fast text-to-3d with sparse-view generation and large reconstruction model. arXiv preprint arXiv:2311.06214, 2023.
- Magic3d: High-resolution text-to-3d content creation. In CVPR, 2023.
- Deep learning for procedural content generation. Neural Computing and Applications, 2021.
- One-2-3-45++: Fast single image to 3d objects with consistent multi-view generation and 3d diffusion. arXiv preprint arXiv:2311.07885, 2023.
- One-2-3-45: Any single image to 3d mesh in 45 seconds without per-shape optimization. In NeurIPS, 2023.
- Zero-1-to-3: Zero-shot one image to 3d object. In ICCV, 2023.
- 3dall-e: Integrating text-to-image ai in 3d design workflows. In ACM DIS, 2023.
- Humangaussian: Text-driven 3d human generation with gaussian splatting. arXiv preprint arXiv:2311.17061, 2023.
- Syncdreamer: Generating multiview-consistent images from a single-view image. ICLR, 2024.
- Wonder3d: Single image to 3d using cross-domain diffusion. arXiv preprint arXiv:2310.15008, 2023.
- SMPL: A skinned multi-person linear model. In Seminal Graphics Papers: Pushing the Boundaries, Volume 2. 2023.
- Point-e: A system for generating 3d point clouds from complex prompts. arXiv preprint arXiv:2212.08751, 2022.
- Dreamfusion: Text-to-3d using 2d diffusion. In ICLR, 2023.
- Dreamgaussian4d: Generative 4d gaussian splatting. arXiv preprint arXiv:2312.17142, 2023.
- Texture: Text-guided texturing of 3d shapes. In SIGGRAPH, 2023.
- PIFu: Pixel-aligned implicit function for high-resolution clothed human digitization. In ICCV, 2019.
- SCULPT: Shape-conditioned unpaired learning of pose-dependent clothed and textured human meshes. arXiv preprint arXiv:2308.10638, 2023.
- Controlroom3d: Room generation using semantic proxy rooms. arXiv preprint arXiv:2312.05208, 2023.
- Graf: Generative radiance fields for 3d-aware image synthesis. NeurIPS, 2020.
- Deep generative models on 3d representations: A survey. arXiv preprint arXiv:2210.15663, 2022.
- Mvdream: Multi-view diffusion for 3d generation. ICLR, 2024.
- Text-to-4d dynamic scene generation. arXiv preprint arXiv:2301.11280, 2023.
- Dreamcraft3d: Hierarchical 3d generation with bootstrapped diffusion prior. arXiv preprint arXiv:2310.16818, 2023.
- Mvdiffusion: Enabling holistic multi-view image generation with correspondence-aware diffusion. 2023.
- Dreamgaussian: Generative gaussian splatting for efficient 3d content creation. ICLR, 2024.
- Rodin: A generative model for sculpting 3d digital avatars using diffusion. In CVPR, 2023.
- Prolificdreamer: High-fidelity and diverse text-to-3d generation with variational score distillation. In NeurIPS, 2023.
- Gpt-4v (ision) is a human-aligned evaluator for text-to-3d generation. arXiv preprint arXiv:2401.04092, 2024.
- Get3DHuman: Lifting StyleGAN-Human into a 3D generative model using pixel-aligned reconstruction priors. In ICCV, 2023.
- Dmv3d: Denoising multi-view diffusion using 3d large reconstruction model. ICLR, 2024.
- 4dgen: Grounded 4d content generation with spatial-temporal consistency. arXiv preprint arXiv:2312.17225, 2023.
- Dreamface: Progressive generation of animatable 3d faces under text guidance. arXiv preprint arXiv:2304.03117, 2023.
- Scenewiz3d: Towards text-guided 3d scene composition. arXiv preprint arXiv:2312.08885, 2023.
- Animate124: Animating one image to 4d dynamic scene. arXiv preprint arXiv:2311.14603, 2023.
Authors: Jian Liu, Xiaoshui Huang, Tianyu Huang, Lu Chen, Yuenan Hou, Shixiang Tang, Ziwei Liu, Wanli Ouyang, Wangmeng Zuo, Junjun Jiang, Xianming Liu