
MVHuman: Tailoring 2D Diffusion with Multi-view Sampling For Realistic 3D Human Generation (2312.10120v1)

Published 15 Dec 2023 in cs.CV

Abstract: Recent months have witnessed rapid progress in 3D generation based on diffusion models. Most advances require fine-tuning existing 2D Stable Diffusion models into multi-view settings or tedious distilling operations, and hence fall short of 3D human generation due to the lack of diverse 3D human datasets. We present an alternative scheme named MVHuman to generate human radiance fields from text guidance, with consistent multi-view images directly sampled from pre-trained Stable Diffusion models without any fine-tuning or distilling. Our core is a multi-view sampling strategy that tailors the denoising processes of the pre-trained network to generate consistent multi-view images. It encompasses view-consistent conditioning, replacing the original noises with "consistency-guided noises", optimizing latent codes, and utilizing cross-view attention layers. With the multi-view images obtained through the sampling process, we adopt geometry refinement and 3D radiance field generation, followed by a neural blending scheme for free-view rendering. Extensive experiments demonstrate the efficacy of our method, as well as its superiority to state-of-the-art 3D human generation methods.

Authors (6)
  1. Suyi Jiang (4 papers)
  2. Haimin Luo (10 papers)
  3. Haoran Jiang (12 papers)
  4. Ziyu Wang (137 papers)
  5. Jingyi Yu (171 papers)
  6. Lan Xu (102 papers)
Citations (2)
