Face Identity-Aware Disentanglement in StyleGAN (2309.12033v1)
Abstract: Conditional GANs are frequently used for manipulating the attributes of face images, such as expression, hairstyle, pose, or age. Although state-of-the-art models successfully modify the requested attributes, they also alter other important characteristics of the image, such as a person's identity. In this paper, we address this problem by introducing PluGeN4Faces, a plugin to StyleGAN that explicitly disentangles face attributes from a person's identity. Our key idea is to train on images retrieved from movie frames, where a given person appears in various poses and with different attributes. By applying a type of contrastive loss, we encourage the model to group images of the same person in similar regions of the latent space. Our experiments demonstrate that the modifications of face attributes performed by PluGeN4Faces are significantly less invasive to the remaining characteristics of the image than those of existing state-of-the-art models.
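The abstract says only that "a type of contrastive loss" is used to pull images of the same person together in latent space; it does not give the exact form. As a rough illustration of the general idea, the sketch below uses the classic margin-based pairwise contrastive loss, which is an assumption and not necessarily the loss used in PluGeN4Faces. The function names, the `margin` parameter, and the use of plain Euclidean distance are all hypothetical choices made for this example.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two latent vectors given as lists of floats."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(latents, identities, margin=1.0):
    """Mean margin-based contrastive loss over all pairs in a batch.

    Assumed form (not taken from the paper):
      - same identity:      d^2                 (pull the pair together)
      - different identity: max(0, margin - d)^2 (push apart up to the margin)
    """
    total, n_pairs = 0.0, 0
    for i, j in combinations(range(len(latents)), 2):
        d = euclidean(latents[i], latents[j])
        if identities[i] == identities[j]:
            total += d ** 2
        else:
            total += max(0.0, margin - d) ** 2
        n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


if __name__ == "__main__":
    # Two frames of person "a" that map close together, one frame of person "b".
    batch = [[0.0, 0.0], [0.1, 0.0], [3.0, 4.0]]
    people = ["a", "a", "b"]
    print(contrastive_loss(batch, people, margin=1.0))
```

In this toy batch the two frames of person "a" contribute a small penalty because they are already close, while the frame of person "b" lies beyond the margin and contributes nothing. In a real setting the latent vectors would come from the model's identity-related latent coordinates, and the loss would be computed with a differentiable framework.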