LaserHuman: Language-guided Scene-aware Human Motion Generation in Free Environment (2403.13307v2)
Abstract: Language-guided scene-aware human motion generation has great significance for entertainment and robotics. In response to the limitations of existing datasets, we introduce LaserHuman, a pioneering dataset engineered to advance Scene-Text-to-Motion research. LaserHuman stands out with its inclusion of genuine human motions captured in 3D environments, unbounded free-form natural language descriptions, a blend of indoor and outdoor scenarios, and dynamic, ever-changing scenes. Diverse modalities of captured data and rich annotations offer great opportunities for research on conditional motion generation and can also facilitate the development of real-life applications. Moreover, to generate semantically consistent and physically plausible human motions, we propose a simple yet effective multi-conditional diffusion model that achieves state-of-the-art performance on existing datasets.
- Peishan Cong (12 papers)
- Yiming Ren (22 papers)
- Wei Yin (58 papers)
- Kai Cheng (38 papers)
- Yujing Sun (21 papers)
- Xiaoxiao Long (47 papers)
- Xinge Zhu (62 papers)
- Yuexin Ma (98 papers)
- Ziyi Wang (449 papers)
- Zhiyang Dou (34 papers)