
DiffusionPhase: Motion Diffusion in Frequency Domain (2312.04036v1)

Published 7 Dec 2023 in cs.CV and cs.LG

Abstract: In this study, we introduce a learning-based method for generating high-quality human motion sequences from text descriptions (e.g., "A person walks forward"). Existing techniques struggle with motion diversity and smooth transitions when generating arbitrary-length motion sequences, due to limited text-to-motion datasets and pose representations that often lack expressiveness or compactness. To address these issues, we propose the first method for text-conditioned human motion generation in the frequency domain of motions. We develop a network encoder that converts the motion space into a compact yet expressive parameterized phase space with high-frequency details encoded, capturing the local periodicity of motions in time and space with high accuracy. We also introduce a conditional diffusion model for predicting periodic motion parameters based on text descriptions and a start pose, efficiently achieving smooth transitions between motion sequences associated with different text descriptions. Experiments demonstrate that our approach outperforms current methods in generating a broader variety of high-quality motions and in synthesizing long sequences with natural transitions.
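
To make the idea of a "parameterized phase space" concrete, the following is a minimal sketch that describes a single periodic signal (for example, one joint coordinate over time) by four numbers. It is not the paper's learned network encoder: the choice of amplitude, frequency, phase, and offset as the parameters, the FFT-based estimate, and the function names `periodic_params` and `reconstruct` are all assumptions made for illustration.

```python
import numpy as np

def periodic_params(signal, fps):
    """Estimate (amplitude, frequency in Hz, phase, offset) of the
    dominant periodic component of a 1D signal sampled at `fps`.
    Illustrative only: a hand-written FFT estimate, not a learned encoder."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    offset = signal.mean()                      # constant (DC) component
    spectrum = np.fft.rfft(signal - offset)
    freqs = np.fft.rfftfreq(n, d=1.0 / fps)
    k = 1 + np.argmax(np.abs(spectrum[1:]))     # strongest non-DC bin
    amplitude = 2.0 * np.abs(spectrum[k]) / n   # valid below the Nyquist bin
    phase = np.angle(spectrum[k])
    return amplitude, freqs[k], phase, offset

def reconstruct(params, n, fps):
    """Rebuild n samples of the signal from the four periodic parameters."""
    amplitude, frequency, phase, offset = params
    t = np.arange(n) / fps
    return amplitude * np.cos(2 * np.pi * frequency * t + phase) + offset

# Hypothetical example: 2 seconds of a 2 Hz oscillation sampled at 30 fps.
t = np.arange(60) / 30.0
x = 0.5 * np.cos(2 * np.pi * 2.0 * t + 0.3) + 1.2
params = periodic_params(x, fps=30)
x_hat = reconstruct(params, n=60, fps=30)
```

In this toy setting, 60 samples are summarized by four parameters, which hints at why a frequency-domain representation can be compact. A real motion signal is only locally periodic and has many channels, which is presumably why the paper learns the encoder rather than using a fixed transform.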

Authors (7)
  1. Weilin Wan (9 papers)
  2. Yiming Huang (55 papers)
  3. Shutong Wu (8 papers)
  4. Taku Komura (66 papers)
  5. Wenping Wang (184 papers)
  6. Dinesh Jayaraman (65 papers)
  7. Lingjie Liu (79 papers)
Citations (4)
