
G3DR: Generative 3D Reconstruction in ImageNet (2403.00939v3)

Published 1 Mar 2024 in cs.CV and cs.GR

Abstract: We introduce Generative 3D Reconstruction (G3DR), a novel 3D generative method for ImageNet, capable of generating diverse and high-quality 3D objects from single images and addressing the limitations of existing methods. At the heart of our framework is a novel depth regularization technique that enables the generation of scenes with high geometric fidelity. G3DR also leverages a pretrained language-vision model, such as CLIP, to enable reconstruction in novel views and improve the visual realism of generations. Additionally, G3DR introduces a simple but effective sampling procedure to further improve the quality of generations. G3DR offers diverse and efficient 3D asset generation based on class or text conditioning. Despite its simplicity, G3DR beats state-of-the-art methods, improving over them by up to 22% in perceptual metrics and 90% in geometry scores, while needing only half the training time. Code is available at https://github.com/preddy5/G3DR
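The abstract does not spell out the form of the depth regularization, so as a hedged illustration only: depth-supervised neural rendering commonly penalizes disagreement between a rendered depth map and a monocular depth estimate while remaining invariant to the monocular estimator's unknown global scale. The sketch below shows one generic such loss (scale-invariant log-depth, in the style of Eigen et al.); the function name and the choice of loss are assumptions, not G3DR's actual formulation.

```python
import math

def scale_invariant_depth_loss(pred, target):
    """Generic scale-invariant log-depth consistency loss (illustrative only).

    Penalizes disagreement between a rendered depth map `pred` and a
    monocular depth estimate `target`, while being invariant to a global
    scale factor on either depth map. Both inputs are flat lists of
    positive depths; the name and formulation are hypothetical, not
    taken from the G3DR paper.
    """
    assert len(pred) == len(target) and len(pred) > 0
    # Work in log-depth so a global scale becomes an additive constant...
    diffs = [math.log(p) - math.log(t) for p, t in zip(pred, target)]
    n = len(diffs)
    mean_sq = sum(d * d for d in diffs) / n
    sq_mean = (sum(diffs) / n) ** 2
    # ...which the variance-style second term then cancels out.
    return mean_sq - sq_mean

# Depths that agree up to a global scale incur (numerically) zero loss:
print(scale_invariant_depth_loss([1.0, 2.0, 4.0], [2.0, 4.0, 8.0]))
```

Because the loss operates on log-depth, a uniform rescaling of the predicted depths shifts every term by the same constant, which the subtracted squared-mean term removes; only relative depth errors are penalized.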

