Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
97 tokens/sec
GPT-4o
53 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

PI3D: Efficient Text-to-3D Generation with Pseudo-Image Diffusion (2312.09069v2)

Published 14 Dec 2023 in cs.CV

Abstract: Diffusion models trained on large-scale text-image datasets have demonstrated a strong capability of controllable high-quality image generation from arbitrary text prompts. However, the generation quality and generalization ability of 3D diffusion models is hindered by the scarcity of high-quality and large-scale 3D datasets. In this paper, we present PI3D, a framework that fully leverages the pre-trained text-to-image diffusion models' ability to generate high-quality 3D shapes from text prompts in minutes. The core idea is to connect the 2D and 3D domains by representing a 3D shape as a set of Pseudo RGB Images. We fine-tune an existing text-to-image diffusion model to produce such pseudo-images using a small number of text-3D pairs. Surprisingly, we find that it can already generate meaningful and consistent 3D shapes given complex text descriptions. We further take the generated shapes as the starting point for a lightweight iterative refinement using score distillation sampling to achieve high-quality generation under a low budget. PI3D generates a single 3D shape from text in only 3 minutes and the quality is validated to outperform existing 3D generative models by a large margin.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (6)
  1. Ying-Tian Liu (7 papers)
  2. Guan Luo (6 papers)
  3. Heyi Sun (2 papers)
  4. Wei Yin (58 papers)
  5. Yuan-Chen Guo (31 papers)
  6. Song-Hai Zhang (41 papers)
Citations (9)

Summary

We haven't generated a summary for this paper yet.