
TIMotion: Temporal and Interactive Framework for Efficient Human-Human Motion Generation (2408.17135v4)

Published 30 Aug 2024 in cs.CV

Abstract: Human-human motion generation is essential for understanding humans as social beings. Current methods fall into two main categories: single-person-based methods and separate modeling-based methods. To analyze this field, we abstract the overall generation process into a general framework, MetaMotion, which consists of two phases: temporal modeling and interaction mixing. For temporal modeling, single-person-based methods directly concatenate two people into a single sequence, while separate modeling-based methods skip the modeling of interaction sequences. This inadequate modeling results in sub-optimal performance and redundant model parameters. In this paper, we introduce TIMotion (Temporal and Interactive Modeling), an efficient and effective framework for human-human motion generation. Specifically, we first propose Causal Interactive Injection, which models two separate sequences as a single causal sequence by leveraging temporal and causal properties. We then present Role-Evolving Scanning to adapt to changes in the active and passive roles throughout the interaction. Finally, to generate smoother and more plausible motion, we design Localized Pattern Amplification to capture short-term motion patterns. Extensive experiments on InterHuman and InterX demonstrate that our method achieves superior performance. Project page: https://aigc-explorer.github.io/TIMotion-page/
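The abstract's central idea — Causal Interactive Injection, which turns two per-person motion sequences into one causal sequence — can be sketched minimally. The interleaving order below (alternating frames from the two people) is an illustrative assumption, not the paper's exact scheme; the function names and NumPy representation are likewise hypothetical.

```python
import numpy as np

def causal_interactive_injection(x_a, x_b):
    """Interleave two per-person motion sequences of shape (T, D) into a
    single causal sequence of shape (2T, D): [a_1, b_1, a_2, b_2, ...].
    The alternating-frame order is an assumption for illustration only."""
    assert x_a.shape == x_b.shape, "both sequences must share (T, D)"
    T, D = x_a.shape
    seq = np.empty((2 * T, D), dtype=x_a.dtype)
    seq[0::2] = x_a  # even positions carry person A's frames
    seq[1::2] = x_b  # odd positions carry person B's frames
    return seq

def split_injected(seq):
    """Recover the two per-person sequences from the interleaved sequence."""
    return seq[0::2], seq[1::2]

# Tiny demo with T=3 frames and D=2 features per frame.
x_a = np.arange(6, dtype=float).reshape(3, 2)        # person A
x_b = 10.0 + np.arange(6, dtype=float).reshape(3, 2)  # person B
seq = causal_interactive_injection(x_a, x_b)          # shape (6, 2)
a_rec, b_rec = split_injected(seq)
```

A sequence model with a causal (left-to-right) attention mask over `seq` would then let each person's frame at time t attend to both people's frames up to t, which is one plausible reading of how the two interaction streams become a single causal sequence.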

