Towards Geometric-Photometric Joint Alignment for Facial Mesh Registration
Abstract: This paper presents a Geometric-Photometric Joint Alignment (GPJA) method, which aligns discrete human expressions at pixel-level accuracy by combining geometric and photometric information. Common practices for registering human heads typically involve aligning landmarks with facial template meshes using geometry processing approaches, but often overlook dense pixel-level photometric consistency. This oversight leads to inconsistent texture parametrization across different expressions, hindering the creation of the topologically consistent head meshes widely used in movies and games. GPJA overcomes this limitation by leveraging differentiable rendering to align vertices with target expressions, achieving joint alignment in both geometry and photometric appearance automatically, without requiring semantic annotation or pre-aligned meshes for training. It features a holistic rendering alignment mechanism and a multiscale regularized optimization for robust convergence under large deformations. The method uses derivatives at vertex positions for supervision and employs a gradient-based algorithm that guarantees smoothness and avoids topological artifacts during the geometry evolution. Experimental results demonstrate faithful alignment under various expressions, surpassing conventional non-rigid ICP-based methods and the state-of-the-art deep-learning-based method. In practice, our method generates meshes of the same subject across diverse expressions, all with the same texture parametrization. This consistency makes face animation, re-parametrization, and other batch operations in face modeling and its applications more efficient.
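The abstract does not spell out the gradient-based update, so the following is only a minimal sketch of one common way to keep vertex updates smooth: diffusing the per-vertex gradient through `(I + lam * L)^-1` before stepping. The uniform graph Laplacian, the weight `lam`, the step size `lr`, and both function names are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def uniform_laplacian(num_vertices, edges):
    """Dense uniform graph Laplacian L = D - A (assumed; fine for tiny meshes)."""
    L = np.zeros((num_vertices, num_vertices))
    for i, j in edges:  # each undirected edge listed once
        L[i, j] -= 1.0
        L[j, i] -= 1.0
        L[i, i] += 1.0
        L[j, j] += 1.0
    return L

def smoothed_step(vertices, grad, L, lam=10.0, lr=0.1):
    """One descent step using the gradient diffused by (I + lam * L)^-1.

    `lam` and `lr` are hypothetical hyperparameters chosen for the demo.
    """
    M = np.eye(L.shape[0]) + lam * L
    smooth_grad = np.linalg.solve(M, grad)  # spreads the gradient to neighbours
    return vertices - lr * smooth_grad, smooth_grad

# Toy example: a chain of 5 vertices with a gradient spike on the middle one.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
L = uniform_laplacian(5, edges)
vertices = np.zeros((5, 3))
grad = np.zeros((5, 3))
grad[2, 2] = 1.0  # stand-in for a derivative from a rendering loss
new_vertices, smooth_grad = smoothed_step(vertices, grad, L)
```

In this toy setup the spike on one vertex is redistributed over its neighbours, so a single noisy derivative moves a whole patch of the mesh coherently rather than pulling one vertex out of the surface.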