Locally conditioned diffusion is an approach to compositional scene diffusion that gives control over the semantic parts of a scene through text prompts and bounding boxes.
Built into a score distillation sampling-based pipeline, the method generates 3D scenes at higher fidelity than existing approaches.
Key terms:
Compositional scene diffusion: The process of generating complex 3D scenes by combining multiple components or objects.
Locally conditioned diffusion: An approach that provides control over semantic parts of a 3D scene using text prompts and bounding boxes, ensuring seamless transitions between parts.
Text prompts: Input text that helps guide the generation of a 3D scene.
Bounding boxes: Rectangular regions that define the boundaries of objects within a 3D scene.
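Score distillation sampling: A technique used in text-to-3D synthesis in which a pretrained diffusion model supplies the guidance signal for optimizing a 3D representation. In this pipeline, it optimizes the Voxel NeRF representation of the 3D scene.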
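
To make the idea of locally conditioned diffusion more concrete, here is a minimal sketch of one way per-region control could work at a single denoising step: each bounding box becomes a binary mask, the denoiser is queried once per text prompt, and the noise predictions are combined through the masks. This is an illustration under stated assumptions, not the authors' implementation. The names box_masks, compose_noise, and fake_predict_noise are hypothetical, the masked-sum rule is assumed from the description above, and fake_predict_noise is a stand-in for a real text-conditioned diffusion model.

```python
import numpy as np

def box_masks(height, width, boxes):
    """Turn (x0, y0, x1, y1) pixel boxes into binary masks, plus one
    background mask covering every pixel that no box claims."""
    masks = []
    covered = np.zeros((height, width), dtype=bool)
    for x0, y0, x1, y1 in boxes:
        m = np.zeros((height, width), dtype=bool)
        m[y0:y1, x0:x1] = True
        m &= ~covered          # assumption: earlier boxes win on overlap
        covered |= m
        masks.append(m)
    masks.append(~covered)     # background region
    return np.stack(masks).astype(np.float32)

def compose_noise(predict_noise, x_t, t, prompts, masks):
    """Masked sum of per-prompt noise predictions (assumed composition rule).
    Every prediction sees the whole image; only its own region is kept."""
    eps = np.zeros_like(x_t)
    for prompt, mask in zip(prompts, masks):
        eps += mask * predict_noise(x_t, t, prompt)
    return eps

def fake_predict_noise(x_t, t, prompt):
    """Hypothetical stand-in for a text-conditioned denoiser: returns a
    constant field so the regions are easy to tell apart."""
    return np.full_like(x_t, float(len(prompt)))

if __name__ == "__main__":
    x_t = np.zeros((3, 4, 6), dtype=np.float32)     # (channels, H, W)
    masks = box_masks(4, 6, [(0, 0, 3, 4)])         # one box + background
    eps = compose_noise(fake_predict_noise, x_t, 0, ["a", "bb"], masks)
    print(eps[0])
```

Because the masks partition the image, every pixel receives exactly one prompt's prediction, while each prediction is still computed from the full noisy image. That shared context is one plausible reason the transitions between parts can stay seamless. In a score distillation sampling setting, the composed prediction would take the place of the single-prompt prediction when guiding the 3D representation.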