GaussianEditor: Swift and Controllable 3D Editing with Gaussian Splatting (2311.14521v4)
Abstract: 3D editing plays a crucial role in many areas such as gaming and virtual reality. Traditional 3D editing methods, which rely on representations like meshes and point clouds, often fall short in realistically depicting complex scenes. On the other hand, methods based on implicit 3D representations, like Neural Radiance Field (NeRF), render complex scenes effectively but suffer from slow processing speeds and limited control over specific scene areas. In response to these challenges, our paper presents GaussianEditor, an innovative and efficient 3D editing algorithm based on Gaussian Splatting (GS), a novel 3D representation. GaussianEditor enhances precision and control in editing through our proposed Gaussian semantic tracing, which traces the editing target throughout the training process. Additionally, we propose Hierarchical Gaussian splatting (HGS) to achieve stabilized and fine results under stochastic generative guidance from 2D diffusion models. We also develop editing strategies for efficient object removal and integration, a challenging task for existing methods. Our comprehensive experiments demonstrate GaussianEditor's superior control, efficacy, and rapid performance, marking a significant advancement in 3D editing. Project Page: https://buaacyw.github.io/gaussian-editor/