VINECS: Video-based Neural Character Skinning (2307.00842v1)
Abstract: Rigging and skinning clothed human avatars is challenging and traditionally requires substantial manual work and expertise. Recent methods either generalize across different characters or focus on capturing the dynamics of a single character observed under different pose configurations. However, the former typically predict only static skinning weights, which perform poorly for highly articulated poses, while the latter either require dense 3D character scans in different poses or cannot generate an explicit mesh with vertex correspondence over time. To address these challenges, we propose a fully automated approach for creating a fully rigged character with pose-dependent skinning weights that can be learned solely from multi-view video. To this end, we first acquire a rigged template, which is then statically skinned. Next, a coordinate-based MLP learns a skinning weights field parameterized over the position in a canonical pose space and the respective pose. Moreover, we introduce a pose- and view-dependent appearance field that allows us to differentiably render the posed mesh and supervise it using multi-view imagery. We show that our approach outperforms the state of the art while not relying on dense 4D scans.
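To make the idea of a pose-dependent skinning weights field concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes a small ReLU MLP, a softmax to normalize the per-joint weights, and standard linear blend skinning to pose the vertices; none of these choices are stated in the abstract. All function names, layer sizes, and the random initialization are hypothetical.

```python
import numpy as np

def init_mlp(rng, sizes):
    """Random parameters for a tiny fully connected network (illustrative only)."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def skinning_weights(params, x_canon, pose):
    """Map canonical points (V, 3) and a pose vector (P,) to weights (V, J).

    Assumption: a softmax keeps the weights non-negative and summing to one.
    """
    pose_tiled = np.broadcast_to(pose, (x_canon.shape[0], pose.shape[0]))
    h = np.concatenate([x_canon, pose_tiled], axis=1)
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)  # ReLU hidden layers
    W, b = params[-1]
    logits = h @ W + b
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)

def linear_blend_skinning(x_canon, weights, joint_transforms):
    """Pose points with linear blend skinning (assumed deformation model).

    x_canon: (V, 3), weights: (V, J), joint_transforms: (J, 4, 4).
    """
    ones = np.ones((x_canon.shape[0], 1))
    x_h = np.concatenate([x_canon, ones], axis=1)                   # (V, 4)
    blended = np.einsum("vj,jab->vab", weights, joint_transforms)  # (V, 4, 4)
    posed = np.einsum("vab,vb->va", blended, x_h)                  # (V, 4)
    return posed[:, :3]

# Hypothetical sizes: 5 vertices, 3 joints, 6-dimensional pose vector.
rng = np.random.default_rng(0)
x_canon = rng.normal(size=(5, 3))
pose = rng.normal(size=6)
params = init_mlp(rng, [3 + 6, 16, 3])
weights = skinning_weights(params, x_canon, pose)
identity = np.tile(np.eye(4), (3, 1, 1))
posed = linear_blend_skinning(x_canon, weights, identity)
```

Because the weights depend on both the canonical position and the pose, the same vertex can be bound differently in different poses, which is the property the abstract contrasts with static skinning weights. With identity joint transforms, the posed points coincide with the canonical ones, which serves as a quick sanity check.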