Making Pose Representations More Expressive and Disentangled via Residual Vector Quantization (2508.14561v1)

Published 20 Aug 2025 in cs.CV and cs.RO

Abstract: Recent progress in text-to-motion has advanced both 3D human motion generation and text-based motion control. Controllable motion generation (CoMo), which enables intuitive control, typically relies on pose code representations, but discrete pose codes alone cannot capture fine-grained motion details, limiting expressiveness. To overcome this, we propose a method that augments pose code-based latent representations with continuous motion features using residual vector quantization (RVQ). This design preserves the interpretability and manipulability of pose codes while effectively capturing subtle motion characteristics such as high-frequency details. Experiments on the HumanML3D dataset show that our model reduces Fréchet inception distance (FID) from 0.041 to 0.015 and improves Top-1 R-Precision from 0.508 to 0.510. Qualitative analysis of pairwise direction similarity between pose codes further confirms the model's controllability for motion editing.
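
For readers unfamiliar with residual vector quantization, the general idea can be sketched in a few lines of NumPy. This is a generic illustration of RVQ and not the authors' implementation: the function name `rvq_encode`, the random codebooks, and the sizes `D`, `K`, and `n_layers` are all assumptions made for the example. Each layer picks the nearest code to whatever the previous layers left unexplained, so the first layer carries the coarse code and later layers add finer detail.

```python
import numpy as np

def rvq_encode(x, codebooks):
    """Quantize x with a stack of codebooks, one per residual layer.

    Returns the chosen index per layer and the summed reconstruction.
    """
    residual = x.astype(float).copy()
    recon = np.zeros_like(residual)
    indices = []
    for cb in codebooks:  # each cb has shape (K, D)
        # Nearest code (squared Euclidean distance) to the current residual.
        dists = np.sum((cb - residual) ** 2, axis=1)
        k = int(np.argmin(dists))
        indices.append(k)
        recon += cb[k]      # accumulate the quantized approximation
        residual -= cb[k]   # pass what is still unexplained to the next layer
    return indices, recon

# Hypothetical sizes and random codebooks, for illustration only.
rng = np.random.default_rng(0)
D, K, n_layers = 8, 16, 3
codebooks = [rng.normal(scale=1.0 / (2 ** i), size=(K, D)) for i in range(n_layers)]

x = rng.normal(size=D)
indices, recon = rvq_encode(x, codebooks)
print("indices per layer:", indices)
print("reconstruction error:", float(np.linalg.norm(x - recon)))
```

In a trained model the codebooks are learned rather than random, and the abstract suggests the pose-code layer plays the role of the coarse, interpretable code while the residual layers capture the fine-grained motion detail.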