
Neural-ABC: Neural Parametric Models for Articulated Body with Clothes (2404.04673v1)

Published 6 Apr 2024 in cs.CV and cs.GR

Abstract: In this paper, we introduce Neural-ABC, a novel parametric model based on neural implicit functions that can represent clothed human bodies with disentangled latent spaces for identity, clothing, shape, and pose. Traditional mesh-based representations struggle to represent articulated bodies with clothes due to the diversity of human body shapes and clothing styles, as well as the complexity of poses. Our proposed model provides a unified framework for parametric modeling that can represent the identity, clothing, shape, and pose of the clothed human body. Our approach uses neural implicit functions as the underlying representation and integrates well-designed structures to meet the necessary requirements. Specifically, we represent the underlying body as a signed distance function and clothing as an unsigned distance function, and both can be uniformly represented as unsigned distance fields. Different types of clothing do not require predefined topological structures or classifications, and can follow changes in the underlying body to fit it. Additionally, we construct poses using a controllable articulated structure. The model is trained on both open and newly constructed datasets, and our decoupling strategy is carefully designed to ensure optimal performance. Our model excels at disentangling clothing and identity across different shapes and poses while preserving the style of the clothing. We demonstrate that Neural-ABC fits new observations of different types of clothing. Compared to other state-of-the-art parametric models, Neural-ABC demonstrates clear advantages in reconstructing clothed human bodies, as evidenced by fitting raw scans, depth maps, and images. We show that the attributes of the fitted results can be further edited by adjusting their identity, clothing, shape, and pose codes.
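The abstract's key representational idea is that a signed distance function (SDF) for the body and an unsigned distance function (UDF) for clothing can share one unsigned-distance representation, since the absolute value of an SDF is itself a valid UDF. The sketch below illustrates this with analytic stand-in fields (a sphere for the body, a spherical shell for a garment); the function names and geometry are illustrative assumptions, not the paper's learned networks.

```python
import numpy as np

def body_sdf(points, radius=1.0):
    """Signed distance to a unit sphere: negative inside, positive outside.
    Stand-in for the paper's learned body SDF."""
    return np.linalg.norm(points, axis=-1) - radius

def clothing_udf(points, radius=1.2):
    """Unsigned distance to a spherical shell, a stand-in for an
    open garment surface, which has no inside/outside and so needs a UDF."""
    return np.abs(np.linalg.norm(points, axis=-1) - radius)

def unified_udf(points):
    """Both layers in one unsigned field: |SDF| is a valid UDF,
    so body and clothing can be queried uniformly."""
    return np.minimum(np.abs(body_sdf(points)), clothing_udf(points))

pts = np.array([[0.0, 0.0, 0.0],   # deep inside the body
                [1.1, 0.0, 0.0],   # between body and garment surfaces
                [2.0, 0.0, 0.0]])  # outside both
print(unified_udf(pts))  # distance to the nearest of the two surfaces
```

A benefit of this unification, as the abstract notes, is that garments of arbitrary topology need no predefined mesh template: any open or closed surface can be expressed as the zero-level set of an unsigned field.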
