Class-Continuous Conditional Generative Neural Radiance Field (2301.00950v3)
Abstract: 3D-aware image synthesis aims to preserve spatial consistency while generating high-resolution images with fine details. Recently, the Neural Radiance Field (NeRF) has been introduced for synthesizing novel views with low computational cost and superior performance. While several works investigate generative NeRFs and show remarkable results, they cannot handle conditional and continuous feature manipulation during generation. In this work, we introduce a novel model, called Class-Continuous Conditional Generative NeRF ($\text{C}^{3}$G-NeRF), which synthesizes conditionally manipulated, photorealistic, 3D-consistent images by projecting conditional features to the generator and the discriminator. The proposed $\text{C}^{3}$G-NeRF is evaluated on three image datasets: AFHQ, CelebA, and Cars. Our model shows strong 3D consistency with fine details and smooth interpolation under conditional feature manipulation. For instance, $\text{C}^{3}$G-NeRF achieves a Fréchet Inception Distance (FID) of 7.64 in 3D-aware face image synthesis at $128^{2}$ resolution. Additionally, because $\text{C}^{3}$G-NeRF can synthesize class-conditional images, we report FIDs of the generated 3D-aware images for each class of the datasets.
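For reference, the FID quoted in the abstract is conventionally computed as the Fréchet distance between two Gaussians fitted to Inception-network features of real and generated images. The notation below is assumed here for illustration and is not taken from the abstract: $\mu_r, \Sigma_r$ denote the feature mean and covariance of real images, and $\mu_g, \Sigma_g$ those of generated images.

$$
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right)
$$

Lower values indicate that the generated feature distribution lies closer to the real one, so a per-class FID would apply the same formula to the real and generated images of a single class.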
Authors: Jiwook Kim, Minhyeok Lee