
SKED: Sketch-guided Text-based 3D Editing (2303.10735v4)

Published 19 Mar 2023 in cs.CV and cs.GR

Abstract: Text-to-image diffusion models are gradually introduced into computer graphics, recently enabling the development of Text-to-3D pipelines in an open domain. However, for interactive editing purposes, local manipulations of content through a simplistic textual interface can be arduous. Incorporating user guided sketches with Text-to-image pipelines offers users more intuitive control. Still, as state-of-the-art Text-to-3D pipelines rely on optimizing Neural Radiance Fields (NeRF) through gradients from arbitrary rendering views, conditioning on sketches is not straightforward. In this paper, we present SKED, a technique for editing 3D shapes represented by NeRFs. Our technique utilizes as few as two guiding sketches from different views to alter an existing neural field. The edited region respects the prompt semantics through a pre-trained diffusion model. To ensure the generated output adheres to the provided sketches, we propose novel loss functions to generate the desired edits while preserving the density and radiance of the base instance. We demonstrate the effectiveness of our proposed method through several qualitative and quantitative experiments. https://sked-paper.github.io/

Authors (5)
  1. Aryan Mikaeili (3 papers)
  2. Or Perel (9 papers)
  3. Mehdi Safaee (2 papers)
  4. Daniel Cohen-Or (172 papers)
  5. Ali Mahdavi-Amiri (31 papers)
Citations (55)

Summary

The paper "SKED: Sketch-guided Text-based 3D Editing" addresses the challenges of interactive 3D shape editing through a novel approach that combines user-guided sketches with text-based pipelines grounded in diffusion models. Traditional Text-to-3D pipelines, which translate textual descriptions into 3D shapes using Neural Radiance Fields (NeRF), fall short in allowing precise, localized modifications essential for interactive design.

Core Contribution:

SKED introduces a technique in which minimal user input—as few as two guiding sketches from different perspectives—can be used to precisely manipulate a neural field representation of a 3D shape. The principal innovation lies in integrating these sketches with a pre-trained text-to-image diffusion model to guide the editing of 3D shapes while maintaining their underlying structure and semantic integrity.

Methodology:

  1. Sketch Integration: The system leverages as few as two sketches from distinct views to guide the modification process.
  2. Diffusion Model Utilization: A pre-trained diffusion model ensures that the semantics derived from text prompts are respected throughout the editing process.
  3. Loss Functions: The authors propose novel loss functions tailored to enforce adherence to the user's sketches while preserving the original density and radiance characteristics of the NeRF.
    • These loss functions balance fidelity to the sketches against the consistency of the rendered images across views, ensuring semantically meaningful and visually coherent edits.
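The interplay of these terms can be sketched as a weighted objective: a diffusion-guidance loss drives the edit toward the text prompt, while preservation terms penalize changes to the base NeRF's density and radiance outside the sketched region. The function and weight names below are illustrative assumptions, not the paper's actual formulation, and the diffusion-guidance term (`sds_loss`) is assumed to be computed elsewhere from the pre-trained model.

```python
import numpy as np

def sked_style_total_loss(density_base, density_edit,
                          color_base, color_edit,
                          edit_mask, sds_loss,
                          w_sds=1.0, w_density=10.0, w_color=10.0):
    """Illustrative combined objective for sketch-guided NeRF editing.

    density_* : (N,) per-sample volume densities from the base/edited field.
    color_*   : (N, 3) per-sample radiance (RGB) from the base/edited field.
    edit_mask : (N,) 1.0 inside the sketched edit region, 0.0 elsewhere.
    sds_loss  : scalar guidance term from a pre-trained text-to-image
                diffusion model (assumed precomputed; name is hypothetical).
    """
    outside = 1.0 - edit_mask
    # Preserve the base instance's density outside the edit region.
    l_density = np.mean(outside * (density_edit - density_base) ** 2)
    # Preserve the base instance's radiance outside the edit region
    # (broadcast the per-sample mask over the RGB channels).
    l_color = np.mean(outside[:, None] * (color_edit - color_base) ** 2)
    return w_sds * sds_loss + w_density * l_density + w_color * l_color

# Example: a field edited only inside the masked region incurs no
# preservation penalty, so the total loss reduces to the guidance term.
density_base = np.array([0.5, 1.0, 0.0])
density_edit = np.array([0.5, 1.0, 2.0])      # changed only at index 2
mask = np.array([0.0, 0.0, 1.0])              # index 2 is the edit region
color = np.zeros((3, 3))
loss = sked_style_total_loss(density_base, density_edit,
                             color, color, mask, sds_loss=0.0)
```

The design intuition is that the preservation weights (`w_density`, `w_color`) anchor everything outside the sketched silhouette to the original shape, so the diffusion guidance can only act locally.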

Experimental Validation:

The paper validates the effectiveness of SKED through both qualitative and quantitative assessments:

  • Qualitative Results: Visual examples showcase the technique's ability to make detailed and precise edits on various 3D models, demonstrating a high degree of control offered by sketch-guided modifications.
  • Quantitative Metrics: Numerical evaluations emphasize the accuracy and consistency of the model in generating the edited 3D shapes.

Impact and Applications:

SKED opens new possibilities for intuitive and interactive 3D content creation: users can perform complex edits with simple sketches, with substantial implications for computer graphics, game development, virtual reality, and related fields.

By addressing the limitations of traditional textual interfaces in 3D editing, SKED represents a significant advancement in user-guided, interactive manipulation of neural radiance fields, bridging the gap between textual and visual input methods.
