
Semantic Human Mesh Reconstruction with Textures (2403.02561v2)

Published 5 Mar 2024 in cs.CV

Abstract: The field of detailed 3D human mesh reconstruction has made significant progress in recent years. However, current methods still face challenges when used in industrial applications due to unstable results, low-quality meshes, and a lack of UV unwrapping and skinning weights. In this paper, we present SHERT, a novel pipeline that can reconstruct semantic human meshes with textures and high-precision details. SHERT applies semantic- and normal-based sampling between the detailed surface (e.g., mesh and SDF) and the corresponding SMPL-X model to obtain a partially sampled semantic mesh, and then generates the complete semantic mesh with our specifically designed self-supervised completion and refinement networks. Using the complete semantic mesh as a basis, we employ a texture diffusion model to create human textures that are driven by both images and text. Our reconstructed meshes have stable UV unwrapping, high-quality triangle meshes, and consistent semantic information. The given SMPL-X model provides semantic information and shape priors, allowing SHERT to perform well even with incorrect and incomplete inputs. The semantic information also makes it easy to substitute and animate different body parts such as the face, body, and hands. Quantitative and qualitative experiments demonstrate that SHERT is capable of producing high-fidelity and robust semantic meshes that outperform state-of-the-art methods.
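
The abstract does not spell out how the semantic- and normal-based sampling works, so the following is only a rough illustrative sketch, not the SHERT implementation. It assumes the detailed surface is available as a point set with normals, moves each template (e.g. SMPL-X) vertex along its own normal to the best-matching surface point, and leaves vertices without a match unsampled, which yields a partially sampled mesh. The function name, the thresholds, and the `semantic_mask` parameter are all hypothetical.

```python
import numpy as np


def sample_along_normals(template_vertices, template_normals,
                         surface_points, surface_normals,
                         semantic_mask=None, max_distance=0.05,
                         max_lateral=0.01, min_normal_cosine=0.5):
    """Move template vertices along their normals onto a detailed surface.

    Hypothetical sketch: thresholds and matching rule are assumptions.
    Returns (sampled_vertices, valid_mask); vertices with valid_mask False
    keep their template position and would need later completion.
    """
    verts = np.asarray(template_vertices, dtype=float)
    normals = np.asarray(template_normals, dtype=float)
    points = np.asarray(surface_points, dtype=float)
    point_normals = np.asarray(surface_normals, dtype=float)

    # Semantic filter: only vertices in allowed regions are sampled.
    if semantic_mask is None:
        semantic_mask = np.ones(len(verts), dtype=bool)

    sampled = verts.copy()
    valid = np.zeros(len(verts), dtype=bool)

    for i in np.flatnonzero(semantic_mask):
        v, n = verts[i], normals[i]
        offsets = points - v
        along = offsets @ n  # signed distance along the vertex normal
        lateral = np.linalg.norm(offsets - np.outer(along, n), axis=1)
        # Normal filter: surface normal must roughly agree with the template.
        agree = point_normals @ n >= min_normal_cosine
        candidates = (np.abs(along) <= max_distance) & (lateral <= max_lateral) & agree
        if not candidates.any():
            continue  # no reliable match: leave this vertex unsampled
        idx = np.flatnonzero(candidates)
        best = idx[np.argmin(lateral[idx])]
        sampled[i] = v + along[best] * n
        valid[i] = True

    return sampled, valid


# Toy usage: two template vertices on the plane z = 0, one surface point above the first.
template_v = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
template_n = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]])
surface_p = np.array([[0.0, 0.0, 0.02]])
surface_n = np.array([[0.0, 0.0, 1.0]])
sampled, valid = sample_along_normals(template_v, template_n, surface_p, surface_n)
```

In this toy case only the first vertex finds a match; the second stays at its template position and is flagged as unsampled, which is the kind of gap the abstract's completion network is described as filling.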

