UniFaceGAN: A Unified Framework for Temporally Consistent Facial Video Editing (2108.05650v1)
Abstract: Recent research has witnessed advances in facial image editing tasks such as face swapping and face reenactment. However, these methods are confined to one specific task at a time. Moreover, for video facial editing, previous methods either simply apply transformations frame by frame or utilize multiple frames in a concatenated or iterative fashion, which leads to noticeable visual flicker. In this paper, we propose a unified, temporally consistent facial video editing framework termed UniFaceGAN. Built on a 3D reconstruction model and a simple yet efficient dynamic training sample selection mechanism, our framework handles face swapping and face reenactment simultaneously. To enforce temporal consistency, a novel 3D temporal loss is introduced based on barycentric coordinate interpolation. In addition, we propose a region-aware conditional normalization layer to replace the traditional AdaIN or SPADE layers and synthesize more context-harmonious results. Compared with state-of-the-art facial image editing methods, our framework generates video portraits that are more photo-realistic and temporally smooth.
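To make the "region-aware conditional normalization" idea concrete, below is a minimal PyTorch sketch in the spirit of AdaIN/SPADE-style modulation, where per-region scale and shift parameters are predicted from a conditioning vector and spread over a face-parsing mask. The abstract does not specify the exact formulation used in UniFaceGAN, so the class name, the use of a soft region mask, and the choice of instance normalization are all illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a region-aware conditional normalization layer.
# Assumptions (not from the paper): instance-norm backbone, a soft face-parsing
# mask with R regions, and a single conditioning vector per image.
import torch
import torch.nn as nn

class RegionAwareNorm(nn.Module):
    def __init__(self, num_features, num_regions, cond_dim):
        super().__init__()
        # Parameter-free normalization of the content features (as in SPADE).
        self.norm = nn.InstanceNorm2d(num_features, affine=False)
        # One (gamma, beta) pair per semantic region, predicted from the condition.
        self.to_gamma = nn.Linear(cond_dim, num_regions * num_features)
        self.to_beta = nn.Linear(cond_dim, num_regions * num_features)
        self.num_regions = num_regions
        self.num_features = num_features

    def forward(self, x, region_mask, cond):
        # x:           (B, C, H, W) content feature map
        # region_mask: (B, R, H, W) soft one-hot face-parsing mask over R regions
        # cond:        (B, D) conditioning vector (e.g., an identity/style code)
        b = x.size(0)
        normalized = self.norm(x)
        gamma = self.to_gamma(cond).view(b, self.num_regions, self.num_features, 1, 1)
        beta = self.to_beta(cond).view(b, self.num_regions, self.num_features, 1, 1)
        # Broadcast each region's scale/shift onto the pixels belonging to it.
        mask = region_mask.unsqueeze(2)           # (B, R, 1, H, W)
        gamma_map = (mask * gamma).sum(dim=1)     # (B, C, H, W)
        beta_map = (mask * beta).sum(dim=1)       # (B, C, H, W)
        return normalized * (1.0 + gamma_map) + beta_map
```

Compared with plain AdaIN (a single global scale/shift) or SPADE (parameters predicted per pixel from a segmentation map alone), this kind of layer lets the modulation vary by facial region while still being driven by the conditioning input, which is one plausible way to obtain more context-harmonious results.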
- Meng Cao (107 papers)
- Haozhi Huang (15 papers)
- Hao Wang (1124 papers)
- Xuan Wang (205 papers)
- Li Shen (363 papers)
- Sheng Wang (239 papers)
- Linchao Bao (43 papers)
- Zhifeng Li (74 papers)
- Jiebo Luo (355 papers)