EyeBAG: Accurate Control of Eye Blink and Gaze Based on Data Augmentation Leveraging Style Mixing (2306.17391v1)
Abstract: Recent developments in generative models have enabled the generation of photo-realistic human face images, and downstream tasks utilizing face generation technology have advanced accordingly. However, models for downstream tasks still perform poorly at eye control (e.g., eye blink and gaze redirection). To overcome these eye control problems, we introduce a novel framework consisting of two distinct modules: a blink control module and a gaze redirection module. We also propose a novel data augmentation method to train each module, leveraging style mixing to obtain images with desired features. We show that our framework produces eye-controlled images of high quality, and demonstrate how it can be used to improve the performance of downstream tasks.
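The abstract names style mixing as the basis of the augmentation method but gives no details. As a rough illustration of the general technique from the StyleGAN literature, and not of the authors' pipeline, the sketch below swaps a chosen range of per-layer style vectors between two latent codes. The function name `style_mix`, the random stand-in latents, and the layer range 4 to 7 are assumptions made for illustration; the 18 x 512 layout is the usual W+ shape for a 1024 x 1024 StyleGAN2 generator.

```python
import numpy as np


def style_mix(w_a, w_b, layers_from_b):
    """Return a copy of w_a whose selected per-layer styles come from w_b.

    w_a, w_b: arrays of shape (num_layers, dim), one style vector per
    generator layer. layers_from_b: iterable of layer indices to take
    from w_b. The inputs are left unmodified.
    """
    if w_a.shape != w_b.shape:
        raise ValueError("latent codes must have the same shape")
    mixed = w_a.copy()
    idx = list(layers_from_b)
    mixed[idx] = w_b[idx]
    return mixed


# Assumed layout: 18 layers of 512-dim styles (StyleGAN2 at 1024 x 1024).
NUM_LAYERS, DIM = 18, 512
rng = np.random.default_rng(0)

# Random stand-ins for the latent codes of two face images.
w_source = rng.standard_normal((NUM_LAYERS, DIM))
w_target = rng.standard_normal((NUM_LAYERS, DIM))

# Hypothetical choice of layers; which layers govern eye attributes
# is not stated in the abstract.
mixed = style_mix(w_source, w_target, layers_from_b=range(4, 8))
```

In a real setup the mixed code would be passed to a pretrained generator to synthesize an image that keeps most attributes of the source while taking the styles of the selected layers from the target.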
Authors: Bryan S. Kim, Jeong Young Jeong, Wonjong Ryu