- The paper introduces DGE, a novel method that achieves multi-view consistent 3D editing by modifying existing image editors with 3D geometry cues.
- The paper improves efficiency by fitting a 3D Gaussian Splatting representation directly to the edited views, avoiding the costly iterative updates that earlier methods require.
- The paper enables selective editing of specific scene sections, ensuring precise control and coherent results across multiple views.
The paper "DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing" addresses the challenge of 3D object and scene editing based on open-ended language instructions. Traditional approaches in this area typically rely on 2D image generators or editors to guide the 3D editing process. However, such methods face significant efficiency issues, primarily due to the need for updating computationally expensive 3D representations like neural radiance fields. Furthermore, these methods often struggle with multi-view consistency, as the 2D models guiding the edits do not inherently support consistent editing across different viewpoints.
To overcome these limitations, the authors introduce the Direct Gaussian Editor (DGE), which rests on two key strategies:
- Multi-View Consistency in Image Editing: The first strategy modifies an existing high-quality image editor, such as InstructPix2Pix, so that its outputs are multi-view consistent. Rather than retraining the editor, the authors propose a training-free approach that leverages the underlying 3D geometry cues of the scene. As a result, an edit applied to one view of the object is reflected coherently in the other views.
- Efficient 3D Object Representation Optimization: The second strategy concerns how the 3D representation is updated. Once a sequence of multi-view consistent edited images is obtained, the authors use 3D Gaussian Splatting to optimize the 3D object representation directly against those images. This avoids applying edits iteratively and incrementally, which is a bottleneck in traditional approaches, and it is the source of DGE's efficiency gains.
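To make the second strategy concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes the edited views are already consistent with one another and fits a shared parameter vector to all of them in a single optimization run. The linear `render` function is a hypothetical stand-in for a Gaussian Splatting rasterizer, and every name and value here is an illustrative assumption.

```python
import numpy as np

def render(params, view_matrix):
    """Toy linear 'renderer' (assumption): maps scene parameters to one view's pixels."""
    return view_matrix @ params

def fit_to_edited_views(params, view_matrices, edited_images, lr=0.5, steps=500):
    """Fit scene parameters to a fixed set of edited views by gradient descent."""
    params = params.copy()
    for _ in range(steps):
        grad = np.zeros_like(params)
        for view_matrix, target in zip(view_matrices, edited_images):
            residual = render(params, view_matrix) - target
            grad += view_matrix.T @ residual  # gradient of 0.5 * ||residual||^2
        params -= lr * grad / len(view_matrices)
    return params

rng = np.random.default_rng(0)
n_params, n_pixels, n_views = 8, 16, 6

# Hypothetical cameras: one random projection per view.
views = [rng.normal(size=(n_pixels, n_params)) / np.sqrt(n_pixels)
         for _ in range(n_views)]

# Consistent edited images: every view is rendered from the same edited scene.
edited_scene = rng.normal(size=n_params)
edited_images = [render(edited_scene, v) for v in views]

# Start from the unedited scene and fit directly to the whole edited set.
original_scene = np.zeros(n_params)
fitted = fit_to_edited_views(original_scene, views, edited_images)
max_err = float(np.max(np.abs(fitted - edited_scene)))
```

Because the targets agree on a single underlying scene, one optimization run over the fixed image set is enough in this toy setting. If the edited views contradicted each other, no single parameter vector would reproduce them all, which is why multi-view consistency in the first strategy matters for the second.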
Additionally, DGE supports selective editing of specific parts of the scene. Users can modify only the desired sections of the 3D object or scene while leaving the rest unchanged, which gives finer control over the result.
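One common way to restrict an edit to part of a scene is to mask the update so that only selected parameters change. The sketch below illustrates that general idea under stated assumptions; the summary does not say how DGE selects the editable region, and `masked_update` and the `editable` flag array are hypothetical names.

```python
import numpy as np

def masked_update(params, grad, editable, lr=0.1):
    """Apply a gradient step only to entries flagged as editable."""
    # Multiplying by the boolean mask zeroes the step for frozen entries.
    return params - lr * grad * editable

params = np.array([1.0, 2.0, 3.0, 4.0])
grad = np.array([0.5, 0.5, 0.5, 0.5])
editable = np.array([True, False, True, False])  # hypothetical selection

updated = masked_update(params, grad, editable)
```

Entries outside the mask keep their original values exactly, so the unselected part of the scene is untouched by the edit.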
In summary, DGE addresses the two main weaknesses of earlier 3D editing methods: efficiency and multi-view consistency. Its key innovations are making a high-quality image editor multi-view consistent through the scene's underlying 3D geometry cues, and using 3D Gaussian Splatting to optimize the 3D representation directly. Together, these make DGE a promising tool for 3D object and scene editing based on open-ended language instructions.