Real-time 3D-aware Portrait Video Relighting (2410.18355v1)
Abstract: Synthesizing realistic videos of talking faces under custom lighting conditions and viewing angles benefits various downstream applications like video conferencing. However, most existing relighting methods are either time-consuming or unable to adjust the viewpoints. In this paper, we present the first real-time 3D-aware method for relighting in-the-wild videos of talking faces based on Neural Radiance Fields (NeRF). Given an input portrait video, our method can synthesize talking faces under both novel views and novel lighting conditions with a photo-realistic and disentangled 3D representation. Specifically, we infer an albedo tri-plane, as well as a shading tri-plane based on a desired lighting condition for each video frame with fast dual-encoders. We also leverage a temporal consistency network to ensure smooth transitions and reduce flickering artifacts. Our method runs at 32.98 fps on consumer-level hardware and achieves state-of-the-art results in terms of reconstruction quality, lighting error, lighting instability, temporal consistency and inference speed. We demonstrate the effectiveness and interactivity of our method on various portrait videos with diverse lighting and viewing conditions.
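To make the "albedo tri-plane plus shading tri-plane" idea concrete, the following is a toy NumPy sketch, not the paper's implementation. It assumes an EG3D-style tri-plane (three axis-aligned feature planes whose bilinearly sampled features are summed) and assumes the final color is the elementwise product of albedo and shading; the abstract does not state either detail. It also omits the encoders, any feature decoder, density, and volume rendering. All function names are hypothetical.

```python
import numpy as np

def sample_plane(plane, u, v):
    """Bilinearly sample a (C, H, W) feature plane at (u, v) in [-1, 1]."""
    _, h, w = plane.shape
    # Map normalized coordinates to continuous pixel coordinates.
    x = (u + 1.0) * 0.5 * (w - 1)
    y = (v + 1.0) * 0.5 * (h - 1)
    # Clamp the top-left corner so the 2x2 neighborhood stays in bounds.
    x0 = int(np.clip(np.floor(x), 0, w - 2))
    y0 = int(np.clip(np.floor(y), 0, h - 2))
    fx, fy = x - x0, y - y0
    top = (1 - fx) * plane[:, y0, x0] + fx * plane[:, y0, x0 + 1]
    bottom = (1 - fx) * plane[:, y0 + 1, x0] + fx * plane[:, y0 + 1, x0 + 1]
    return (1 - fy) * top + fy * bottom

def sample_triplane(triplane, point):
    """Sum features from the XY, XZ and YZ planes for one 3D point.

    triplane has shape (3, C, H, W); summing the three lookups is an
    assumption borrowed from EG3D-style representations.
    """
    x, y, z = point
    return (sample_plane(triplane[0], x, y)
            + sample_plane(triplane[1], x, z)
            + sample_plane(triplane[2], y, z))

def shade_point(albedo_triplane, shading_triplane, point):
    """Assumed composition: color = albedo * shading (elementwise).

    Here the sampled features are treated directly as RGB-like values;
    a real system would typically decode them with a small network.
    """
    albedo = sample_triplane(albedo_triplane, point)
    shading = sample_triplane(shading_triplane, point)
    return albedo * shading

# Toy usage: constant planes, so the result is easy to reason about.
albedo_planes = np.full((3, 3, 8, 8), 0.2)   # 3 planes, 3 channels, 8x8
shading_planes = np.full((3, 3, 8, 8), 0.5)
color = shade_point(albedo_planes, shading_planes, (0.1, -0.3, 0.7))
```

Keeping albedo and shading in separate tri-planes means that, under these assumptions, changing the lighting only requires recomputing the shading planes while the albedo planes for a frame stay fixed.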