LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes (2311.13384v2)
Abstract: With the widespread adoption of VR devices and content, the demand for 3D scene generation techniques has grown. Existing 3D scene generation models, however, limit the target scene to a specific domain, primarily because they are trained on 3D scan datasets that are far removed from the real world. To address this limitation, we propose LucidDreamer, a domain-free scene generation pipeline that fully leverages the power of existing large-scale diffusion-based generative models. LucidDreamer alternates between two steps: Dreaming and Alignment. First, to generate multi-view consistent images from the inputs, we use the point cloud as a geometric guideline for each image generation: we project a portion of the point cloud into the desired view and use the projection as guidance for inpainting with the generative model. The inpainted images are lifted into 3D space with estimated depth maps, forming new points. Second, to aggregate the new points into the 3D scene, we propose an alignment algorithm that harmoniously integrates the newly generated portions of the 3D scene. The resulting 3D scene serves as the initial point set for optimizing Gaussian splats. LucidDreamer produces Gaussian splats that are highly detailed compared with previous 3D scene generation methods, with no constraint on the domain of the target scene. Project page: https://luciddreamer-cvlab.github.io/
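The abstract describes a project-inpaint-lift loop. Below is a minimal sketch of one such "Dreaming" iteration, not the authors' implementation: it assumes a simple pinhole camera model, and the `inpaint`, `estimate_depth`, and `merge_aligned` callables in the usage comment are hypothetical stand-ins for the diffusion inpainting model, the monocular depth estimator, and the Alignment step.

```python
import numpy as np

def project_points(points, colors, K, R, t, hw):
    """Rasterize colored 3D points into a view; return the guide image and a visibility mask."""
    h, w = hw
    cam = (R @ points.T + t[:, None]).T            # world -> camera coordinates
    in_front = cam[:, 2] > 1e-6
    cam, colors = cam[in_front], colors[in_front]
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]                    # perspective division
    u, v = uv[:, 0].round().astype(int), uv[:, 1].round().astype(int)
    valid = (0 <= u) & (u < w) & (0 <= v) & (v < h)
    image = np.zeros((h, w, 3), dtype=np.float32)
    mask = np.zeros((h, w), dtype=bool)
    # Naive z-buffer: draw far points first so nearer points overwrite them.
    order = np.argsort(-cam[valid, 2])
    u, v, c = u[valid][order], v[valid][order], colors[valid][order]
    image[v, u] = c
    mask[v, u] = True
    return image, mask

def lift_to_3d(image, depth, K, R, t):
    """Unproject every pixel to world coordinates using an estimated depth map."""
    h, w, _ = image.shape
    v, u = np.mgrid[0:h, 0:w]
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float32)
    cam = (np.linalg.inv(K) @ pix.T).T * depth.reshape(-1, 1)   # pixel -> camera ray * depth
    world = (R.T @ (cam - t).T).T                               # camera -> world
    return world, image.reshape(-1, 3)

# One Dreaming iteration (hypothetical inpaint / estimate_depth / merge_aligned stand-ins):
#   guide, visible = project_points(points, colors, K, R, t, (H, W))
#   completed      = inpaint(guide, ~visible, prompt)     # diffusion inpainting of empty pixels
#   depth          = estimate_depth(completed)            # monocular depth estimation
#   new_pts, new_c = lift_to_3d(completed, depth, K, R, t)
#   points, colors = merge_aligned(points, colors,        # Alignment: integrate only new points
#                                  new_pts[~visible.ravel()], new_c[~visible.ravel()])
```

The aggregated point cloud produced by repeating this loop over many views would then serve as the initialization for Gaussian splat optimization, as stated in the abstract.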
Authors: Jaeyoung Chung, Suyoung Lee, Hyeongjin Nam, Jaerin Lee, Kyoung Mu Lee