3D-SSGAN: Lifting 2D Semantics for 3D-Aware Compositional Portrait Synthesis (2401.03764v1)

Published 8 Jan 2024 in cs.CV and cs.GR

Abstract: Existing 3D-aware portrait synthesis methods can generate impressive high-quality images while preserving strong 3D consistency. However, most of them cannot support the fine-grained part-level control over synthesized images. Conversely, some GAN-based 2D portrait synthesis methods can achieve clear disentanglement of facial regions, but they cannot preserve view consistency due to a lack of 3D modeling abilities. To address these issues, we propose 3D-SSGAN, a novel framework for 3D-aware compositional portrait image synthesis. First, a simple yet effective depth-guided 2D-to-3D lifting module maps the generated 2D part features and semantics to 3D. Then, a volume renderer with a novel 3D-aware semantic mask renderer is utilized to produce the composed face features and corresponding masks. The whole framework is trained end-to-end by discriminating between real and synthesized 2D images and their semantic masks. Quantitative and qualitative evaluations demonstrate the superiority of 3D-SSGAN in controllable part-level synthesis while preserving 3D view consistency.
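
The abstract describes a pipeline in which per-part 2D features and semantics are lifted into 3D under depth guidance, composed across parts, and then volume-rendered together with 3D-aware semantic masks. The sketch below is a minimal, hypothetical PyTorch illustration of that data flow only: the module names, tensor shapes, the Gaussian depth-placement scheme, the crude per-part logits, and the depth-averaging stand-in "renderer" are all assumptions made for illustration and are not taken from the paper.

# Illustrative sketch (not the authors' code): depth-guided 2D-to-3D lifting,
# semantic composition of per-part volumes, and a toy front-view projection.
import torch
import torch.nn as nn


class DepthGuidedLifting(nn.Module):
    """Lift a 2D feature map into a 3D feature volume using a predicted depth map.

    Each pixel's feature is spread softly over depth bins centred on its predicted
    depth (a simple Gaussian placement; the paper's exact scheme may differ).
    """

    def __init__(self, num_depth_bins: int = 32, sigma: float = 0.05):
        super().__init__()
        self.sigma = sigma
        # Depth-bin centres in normalised [0, 1] camera-space depth.
        self.register_buffer("bin_centres", torch.linspace(0.0, 1.0, num_depth_bins))

    def forward(self, feat_2d: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        # feat_2d: (B, C, H, W), depth: (B, 1, H, W) with values in [0, 1]
        diff = depth - self.bin_centres.view(1, -1, 1, 1)          # (B, D, H, W)
        weights = torch.softmax(-(diff ** 2) / (2 * self.sigma ** 2), dim=1)
        # Outer product over channels and depth bins -> (B, C, D, H, W)
        return feat_2d.unsqueeze(2) * weights.unsqueeze(1)


class SemanticComposer(nn.Module):
    """Compose per-part 3D feature volumes with soft per-voxel semantic weights."""

    def forward(self, part_volumes, part_logits):
        # part_volumes: list of (B, C, D, H, W); part_logits: list of (B, 1, D, H, W)
        masks = torch.softmax(torch.cat(part_logits, dim=1), dim=1)   # (B, P, D, H, W)
        stacked = torch.stack(part_volumes, dim=1)                    # (B, P, C, D, H, W)
        composed = (stacked * masks.unsqueeze(2)).sum(dim=1)          # (B, C, D, H, W)
        return composed, masks


def render_front_view(volume: torch.Tensor, masks: torch.Tensor):
    """Toy orthographic projection: average features and masks along depth.

    A real volume renderer would accumulate along camera rays with transmittance;
    this stand-in only illustrates the flow from the composed volume to an
    image-space feature map and its semantic mask.
    """
    return volume.mean(dim=2), masks.mean(dim=2)  # (B, C, H, W), (B, P, H, W)


if __name__ == "__main__":
    B, C, H, W, P = 2, 16, 64, 64, 3
    lifter, composer = DepthGuidedLifting(), SemanticComposer()

    part_volumes, part_logits = [], []
    for _ in range(P):  # one generator branch per facial part in this sketch
        feat = torch.randn(B, C, H, W)                    # placeholder 2D part features
        depth = torch.sigmoid(torch.randn(B, 1, H, W))    # placeholder predicted depth
        vol = lifter(feat, depth)
        part_volumes.append(vol)
        part_logits.append(vol.mean(dim=1, keepdim=True))  # crude per-part logit volume

    composed, masks = composer(part_volumes, part_logits)
    feat_img, sem_img = render_front_view(composed, masks)
    print(feat_img.shape, sem_img.shape)  # (2, 16, 64, 64) (2, 3, 64, 64)

In the paper itself, the final step is a proper volume renderer paired with a 3D-aware semantic mask renderer, and training is driven end-to-end by discriminating real versus synthesized images and their semantic masks; the averaging projection above is only a placeholder for that stage.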
