
FocalDreamer: Text-driven 3D Editing via Focal-fusion Assembly (2308.10608v2)

Published 21 Aug 2023 in cs.CV, cs.GR, and cs.LG

Abstract: While text-3D editing has made significant strides in leveraging score distillation sampling, emerging approaches still fall short in delivering separable, precise and consistent outcomes that are vital to content creation. In response, we introduce FocalDreamer, a framework that merges base shape with editable parts according to text prompts for fine-grained editing within desired regions. Specifically, equipped with geometry union and dual-path rendering, FocalDreamer assembles independent 3D parts into a complete object, tailored for convenient instance reuse and part-wise control. We propose geometric focal loss and style consistency regularization, which encourage focal fusion and congruent overall appearance. Furthermore, FocalDreamer generates high-fidelity geometry and PBR textures which are compatible with widely-used graphics engines. Extensive experiments have highlighted the superior editing capabilities of FocalDreamer in both quantitative and qualitative evaluations.

"FocalDreamer: Text-driven 3D Editing via Focal-fusion Assembly" targets text-driven 3D editing, where content creation calls for edits that are separable, precise, and consistent. According to the paper, emerging approaches built on score distillation sampling still fall short on these three properties. FocalDreamer is a framework that merges a base shape with editable parts according to text prompts, so that fine-grained edits stay within the desired regions.

The framework uses geometry union and dual-path rendering to assemble independent 3D parts into a complete object. Keeping the parts independent is intended to make instance reuse convenient and to allow part-wise control, both of which matter in content creation workflows.
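The idea of a geometry union can be illustrated with a minimal sketch. It assumes each part is described by a signed distance function (negative inside the shape), in which case the union of several parts is the pointwise minimum of their distances. The summary does not state which representation FocalDreamer uses, so this is an illustration of the general technique, not the paper's implementation; the names `sphere_sdf`, `union_sdf`, and `is_inside` are hypothetical.

```python
import math

def sphere_sdf(center, radius):
    """Return a signed distance function for a sphere (negative inside)."""
    def sdf(p):
        return math.dist(p, center) - radius
    return sdf

def union_sdf(*sdfs):
    """Union of shapes: the pointwise minimum of the signed distances."""
    def sdf(p):
        return min(f(p) for f in sdfs)
    return sdf

def is_inside(sdf, p):
    """A point is inside a shape when its signed distance is negative."""
    return sdf(p) < 0.0

# Hypothetical stand-ins: a base shape and one independent editable part.
base = sphere_sdf((0.0, 0.0, 0.0), 1.0)
part = sphere_sdf((1.2, 0.0, 0.0), 0.5)

# The merged object contains every point that lies in either part.
merged = union_sdf(base, part)
```

Because `base` and `part` remain separate functions, either one can be swapped out or reused in another assembly without touching the other, which is the kind of part-wise control the paragraph above describes.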

The paper proposes two further components: a geometric focal loss and a style consistency regularization. The geometric focal loss encourages focal fusion, so that the merged parts integrate seamlessly. The style consistency regularization encourages a congruent overall appearance, keeping the edited regions stylistically consistent with the rest of the object.

FocalDreamer also generates high-fidelity geometry and physically based rendering (PBR) textures. These outputs are compatible with widely used graphics engines, so the generated content can be integrated into existing pipelines.

The paper reports extensive experiments, covering both quantitative metrics and qualitative evaluations, in which FocalDreamer shows superior editing capabilities. On this evidence, FocalDreamer advances text-driven 3D content creation by giving artists and designers a tool for detailed, coherent 3D edits driven by textual descriptions.

Authors (8)
  1. Yuhan Li (49 papers)
  2. Yishun Dou (8 papers)
  3. Yue Shi (65 papers)
  4. Yu Lei (56 papers)
  5. Xuanhong Chen (16 papers)
  6. Yi Zhang (994 papers)
  7. Peng Zhou (136 papers)
  8. Bingbing Ni (95 papers)
Citations (48)