Locally Adaptive Neural 3D Morphable Models (2401.02937v1)
Abstract: We present the Locally Adaptive Morphable Model (LAMM), a highly flexible Auto-Encoder (AE) framework for learning to generate and manipulate 3D meshes. We train our architecture with a simple self-supervised scheme in which input displacements over a sparse set of control vertices overwrite the encoded geometry, transforming one training sample into another. During inference, our model produces a dense output that adheres locally to the specified sparse geometry while maintaining the overall appearance of the encoded object. This approach results in state-of-the-art performance in both disentangling manipulated geometry and 3D mesh reconstruction. To the best of our knowledge, LAMM is the first end-to-end framework that enables direct local control of 3D vertex geometry in a single forward pass. A very efficient computational graph allows our network to train with only a fraction of the memory required by previous methods and to run faster during inference, generating 12k-vertex meshes at more than 60 fps on a single CPU thread. We further leverage local geometry control as a primitive for higher-level editing operations and present a set of derivative capabilities such as swapping and sampling object parts. Code and pretrained models can be found at https://github.com/michaeltrs/LAMM.
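The self-supervised scheme described in the abstract can be illustrated with a minimal sketch of how one training example might be assembled. This is not the authors' code: the function names `make_training_pair` and `control_vertex_error`, the random stand-in meshes, and the assumption that all meshes share one topology in dense correspondence are hypothetical choices made for illustration only.

```python
import numpy as np

def make_training_pair(source, target, control_idx):
    """Build one self-supervised example from two meshes (hypothetical helper).

    source, target: (V, 3) vertex arrays assumed to share one topology.
    control_idx: (K,) integer indices of the sparse control vertices.
    """
    # Displacements that move the source's control vertices onto the target's.
    displacements = target[control_idx] - source[control_idx]
    # A model would receive (source, displacements) and be trained to output target.
    return source, displacements, target

def control_vertex_error(pred, source, displacements, control_idx):
    """Mean Euclidean distance between predicted and requested control positions."""
    requested = source[control_idx] + displacements
    return float(np.linalg.norm(pred[control_idx] - requested, axis=1).mean())

# Random stand-in data; real meshes would come from a registered dataset.
rng = np.random.default_rng(0)
V, K = 12000, 8
source = rng.normal(size=(V, 3))
target = rng.normal(size=(V, 3))
control_idx = rng.choice(V, size=K, replace=False)

src, disp, tgt = make_training_pair(source, target, control_idx)
# A perfect prediction (the target itself) matches the requested control positions.
perfect_error = control_vertex_error(tgt, src, disp, control_idx)
```

The second helper measures only how closely an output follows the requested sparse geometry; the abstract's other goal, preserving the overall appearance of the encoded object away from the control vertices, would need a separate term that this sketch does not attempt to model.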