G-NeRF: Geometry-enhanced Novel View Synthesis from Single-View Images (2404.07474v1)
Abstract: Novel view synthesis aims to generate new-view images from a given collection of view images. Recent attempts address this problem by relying on 3D geometry priors (e.g., shapes, sizes, and positions) learned from multi-view images. However, such methods have the following limitations: 1) they require a set of multi-view images as training data for a specific scene (e.g., face, car, or chair), which is often unavailable in real-world scenarios; 2) they fail to extract geometry priors from single-view images due to the lack of multi-view supervision. In this paper, we propose a Geometry-enhanced NeRF (G-NeRF), which seeks to enhance the geometry priors through a geometry-guided multi-view synthesis approach followed by depth-aware training. In the synthesis process, motivated by the observation that existing 3D GAN models can unconditionally synthesize high-fidelity multi-view images, we adopt off-the-shelf 3D GAN models, such as EG3D, as a free source of geometry priors by synthesizing multi-view data. In addition, to further improve the geometry quality of the synthetic data, we introduce a truncation method to effectively sample latent codes within 3D GAN models. To tackle the absence of multi-view supervision for single-view images, we design a depth-aware training approach that incorporates a depth-aware discriminator to guide geometry priors through depth maps. Experiments demonstrate the effectiveness of our method both qualitatively and quantitatively.
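The abstract does not spell out how the truncation method samples latent codes. As a rough illustration only, the sketch below shows the standard truncation trick used with style-based GANs, which pulls each latent code toward the mean latent: w' = w_avg + psi * (w - w_avg). This is an assumption about the general family of techniques, not the paper's actual procedure. The names `truncate_latent`, `mean_latent`, `psi`, and the toy Gaussian latents are hypothetical stand-ins for a real 3D GAN's mapping network output.

```python
import random


def mean_latent(latents):
    """Average a list of latent vectors component-wise."""
    n = len(latents)
    return [sum(column) / n for column in zip(*latents)]


def truncate_latent(w, w_avg, psi=0.7):
    """Standard truncation trick: interpolate w toward the mean latent.

    psi = 1.0 leaves w unchanged; psi = 0.0 collapses it onto w_avg.
    Smaller psi trades sample diversity for more typical outputs.
    """
    return [m + psi * (x - m) for x, m in zip(w, w_avg)]


# Toy stand-in for latent codes (assumption: a real pipeline would
# obtain these from the generator's mapping network, not a Gaussian).
random.seed(0)
dim = 8
latents = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(1000)]
w_avg = mean_latent(latents)

# Truncated codes lie closer to the mean latent than the originals.
truncated = [truncate_latent(w, w_avg, psi=0.5) for w in latents]
```

In such a setup, the truncated codes would then be fed to the generator to render multi-view images; how G-NeRF chooses its truncation strength or sampling region is not stated in the abstract.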
Authors: Zixiong Huang, Qi Chen, Libo Sun, Yifan Yang, Naizhou Wang, Mingkui Tan, Qi Wu