Generative Motion Stylization of Cross-structure Characters within Canonical Motion Space (2403.11469v2)

Published 18 Mar 2024 in cs.CV and cs.GR

Abstract: Stylized motion breathes life into characters. However, fixed skeleton structures and style representations hinder existing data-driven motion synthesis methods from generating stylized motion for various characters. In this work, we propose a generative motion stylization pipeline, named MotionS, for synthesizing diverse and stylized motion on cross-structure characters using cross-modality style prompts. Our key insight is to embed motion style into a cross-modality latent space and perceive the cross-structure skeleton topologies, allowing for motion stylization within a canonical motion space. Specifically, the large-scale Contrastive Language-Image Pre-training (CLIP) model is leveraged to construct the cross-modality latent space, enabling flexible style representation within it. Additionally, two topology-encoded tokens are learned to capture the canonical and specific skeleton topologies, facilitating cross-structure topology shifting. Subsequently, the topology-shifted stylization diffusion is designed to generate motion content for the particular skeleton and stylize it in the shifted canonical motion space using multi-modality style descriptions. Through an extensive set of examples, we demonstrate the flexibility and generalizability of our pipeline across various characters and style descriptions. Qualitative and quantitative comparisons show the superiority of our pipeline over state-of-the-art methods, consistently delivering high-quality stylized motion across a broad spectrum of skeletal structures.
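
The abstract describes two conditioning signals feeding a stylization diffusion model: a CLIP-derived style embedding (from text or image prompts) and learned skeleton-topology tokens. The sketch below illustrates that conditioning pattern only; it is a minimal reconstruction under stated assumptions, not the authors' implementation. The `TopologyShiftedDenoiser` module, its dimensions, and the choice of OpenAI's `clip` package (`clip.load`, `clip.tokenize`, `encode_text`) as the style encoder are assumptions for illustration; the diffusion timestep embedding and the canonical-space shift itself are omitted for brevity.

```python
# Minimal sketch of CLIP-based style conditioning for a motion diffusion
# denoiser. Not the authors' code; module names and sizes are hypothetical.
import torch
import clip  # assumes OpenAI's CLIP package: pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)

def style_embedding_from_text(prompt: str) -> torch.Tensor:
    """Embed a text style prompt into CLIP's joint text-image latent space."""
    tokens = clip.tokenize([prompt]).to(device)
    with torch.no_grad():
        z = clip_model.encode_text(tokens)       # (1, 512) for ViT-B/32
    return z / z.norm(dim=-1, keepdim=True)      # unit-normalize, as is usual with CLIP

class TopologyShiftedDenoiser(torch.nn.Module):
    """Hypothetical denoiser for the stylization diffusion: predicts noise on a
    motion sequence, conditioned on a CLIP style embedding and a learned
    per-skeleton topology token (diffusion timestep conditioning omitted)."""
    def __init__(self, d_motion: int = 256, d_style: int = 512, n_topologies: int = 8):
        super().__init__()
        self.topology_tokens = torch.nn.Embedding(n_topologies, d_motion)
        self.style_proj = torch.nn.Linear(d_style, d_motion)
        layer = torch.nn.TransformerEncoderLayer(d_model=d_motion, nhead=4, batch_first=True)
        self.backbone = torch.nn.TransformerEncoder(layer, num_layers=4)
        self.head = torch.nn.Linear(d_motion, d_motion)

    def forward(self, noisy_motion, style_z, topology_id):
        # noisy_motion: (B, T, d_motion), style_z: (B, d_style), topology_id: (B,)
        topo = self.topology_tokens(topology_id).unsqueeze(1)    # (B, 1, D)
        style = self.style_proj(style_z.float()).unsqueeze(1)    # (B, 1, D)
        x = torch.cat([topo, style, noisy_motion], dim=1)        # prepend condition tokens
        return self.head(self.backbone(x))[:, 2:]                # drop the two condition tokens
```

Because CLIP places text and images in the same latent space, a text prompt via `style_embedding_from_text("a proud walk")` and an image prompt via `clip_model.encode_image` would yield interchangeable style conditions, which is what allows a single style representation to serve cross-modality prompts.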
