HeadGaS: Real-Time Animatable Head Avatars via 3D Gaussian Splatting (2312.02902v2)
Abstract: 3D head animation has seen major quality and runtime improvements over the last few years, driven in particular by advances in differentiable rendering and neural radiance fields. Real-time rendering is a highly desirable goal for real-world applications. We propose HeadGaS, a model that uses 3D Gaussian Splats (3DGS) for 3D head reconstruction and animation. In this paper we introduce a hybrid model that extends the explicit 3DGS representation with a base of learnable latent features, which can be linearly blended with low-dimensional parameters from parametric head models to obtain expression-dependent color and opacity values. We demonstrate that HeadGaS delivers state-of-the-art results at real-time inference frame rates, surpassing baselines by up to 2 dB while accelerating rendering speed by over 10×.
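
The linear blending described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`blend_features`, `decode`), the feature dimensions, and the single linear layer with a sigmoid used as a decoder are all assumptions chosen to keep the example short. It only shows the general idea of a per-Gaussian feature basis that is weighted by expression parameters and then mapped to color and opacity.

```python
import math


def blend_features(basis, expression):
    """Linearly blend one Gaussian's feature basis with expression weights.

    basis: list of B feature vectors, each of length D (learnable in practice).
    expression: list of B low-dimensional expression parameters.
    Returns the blended feature vector of length D.
    """
    if len(basis) != len(expression):
        raise ValueError("basis and expression must have the same length")
    dim = len(basis[0])
    return [sum(w * vec[d] for w, vec in zip(expression, basis)) for d in range(dim)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode(feature, weights, bias):
    """Hypothetical decoder (an assumption, not from the paper).

    Applies one linear layer with 4 output rows followed by a sigmoid,
    and returns ((r, g, b), opacity) with every value in (0, 1).
    """
    out = [
        sigmoid(sum(w * f for w, f in zip(row, feature)) + b)
        for row, b in zip(weights, bias)
    ]
    return tuple(out[:3]), out[3]


# Toy example: one Gaussian, B = 2 basis vectors, D = 2 feature channels.
basis = [[1.0, 0.0], [0.0, 1.0]]
expression = [0.5, 2.0]
feature = blend_features(basis, expression)

# Placeholder decoder parameters (all zeros), purely for illustration.
weights = [[0.0, 0.0] for _ in range(4)]
bias = [0.0, 0.0, 0.0, 0.0]
color, opacity = decode(feature, weights, bias)
```

Because the blend is a weighted sum over the basis, changing the expression parameters changes the decoded color and opacity of a Gaussian without changing the basis itself.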