AniArtAvatar: Animatable 3D Art Avatar from a Single Image (2403.17631v1)
Abstract: We present a novel approach for generating animatable 3D-aware art avatars from a single image, with controllable facial expressions, head poses, and shoulder movements. Unlike previous reenactment methods, our approach uses a view-conditioned 2D diffusion model to synthesize multi-view images from a single art portrait with a neutral expression. From the generated colors and normals, we reconstruct a static avatar as an SDF-based neural surface. For animation, we extract control points, transfer motion with these points, and deform the implicit canonical space. First, we render the front view of the avatar, extract 2D landmarks, and project them into 3D space using the trained SDF network. We then extract 3D driving landmarks with a 3DMM and transfer their motion to the avatar landmarks. To animate the avatar's pose, we manually set the body height and bind the head and torso to two cages; transforming the two cages animates the head and torso. Our approach is a one-shot pipeline applicable to a wide range of art styles. Experiments demonstrate that our method generates high-quality 3D art avatars with the desired control over different motions.
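The landmark step of the abstract can be pictured concretely. The sketch below is our illustration, not the authors' code: `sdf` stands for the trained SDF network, and the per-landmark camera rays are assumed to be precomputed. It lifts the 2D landmarks to 3D by sphere tracing each ray to the SDF's zero level set, then applies the 3DMM driving landmarks' displacement to the lifted avatar landmarks:

```python
import torch

def sphere_trace(sdf, origins, dirs, n_steps=64, eps=1e-4):
    """Lift 2D landmarks to 3D: march each camera ray until it reaches
    the zero level set of the trained SDF.

    sdf:     callable mapping (N, 3) points to (N,) signed distances
    origins: (N, 3) ray origins (camera center, one ray per 2D landmark)
    dirs:    (N, 3) unit ray directions through the landmark pixels
    """
    t = torch.zeros(origins.shape[0], device=origins.device)
    for _ in range(n_steps):
        pts = origins + t[:, None] * dirs   # current sample along each ray
        d = sdf(pts)                        # signed distance to the surface
        if torch.all(d.abs() < eps):        # every ray has converged
            break
        t = t + d                           # safe step size: the SDF value itself
    return origins + t[:, None] * dirs      # 3D avatar landmark positions

def transfer_motion(avatar_lmk, drv_rest, drv_frame, scale=1.0):
    """Apply the 3DMM driving landmarks' displacement to the avatar's
    neutral 3D landmarks; `scale` (an assumed parameter) compensates
    for the differing proportions of stylized faces."""
    return avatar_lmk + scale * (drv_frame - drv_rest)
```

The displaced landmarks then serve as the control points that drive the deformation of the implicit canonical space.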
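The two-cage pose animation can likewise be sketched. Cage-based deformation expresses each point through generalized barycentric coordinates (e.g., mean value coordinates) of the cage vertices, so moving a cage moves the points bound to it. The sketch below is a minimal illustration of that general technique under our own assumptions, not the paper's exact formulation; the coordinate computation is omitted and the `phi` matrices are assumed precomputed:

```python
import numpy as np

def cage_deform(phi, cage_verts):
    """Reconstruct point positions from generalized barycentric coordinates.

    phi:        (N, V) coordinates of N points w.r.t. V cage vertices
                (e.g., mean value coordinates; computation omitted here)
    cage_verts: (V, 3) cage vertex positions (rest or user-transformed)
    """
    return phi @ cage_verts

# Forward: canonical points bound to a cage follow its transform.
# deformed_pts = cage_deform(phi_rest, cage_transformed)
#
# Inverse (for rendering the implicit avatar): re-express a render-space
# sample in the *transformed* cage, then map it back through the rest cage,
# so the canonical SDF and color fields can be queried unchanged.
# canonical_query = cage_deform(phi_in_transformed_cage, cage_rest)
```

Using two cages keeps the head and torso as separate rigid units, so the head can be rotated or translated relative to the torso while each region deforms smoothly inside its own cage.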