ActFormer: A GAN-based Transformer towards General Action-Conditioned 3D Human Motion Generation (2203.07706v2)
Abstract: We present a GAN-based Transformer for general action-conditioned 3D human motion generation, covering not only single-person actions but also multi-person interactive actions. Our approach consists of a powerful Action-conditioned motion TransFormer (ActFormer) under a GAN training scheme, equipped with a Gaussian Process latent prior. This design combines the strong spatio-temporal representation capacity of the Transformer, the generative modeling strength of GANs, and the inherent temporal correlations of the latent prior. Furthermore, ActFormer can be naturally extended to multi-person motions by alternately modeling temporal correlations and human interactions with Transformer encoders. To further facilitate research on multi-person motion generation, we introduce a new synthetic dataset of complex multi-person combat behaviors. Extensive experiments on NTU-13, NTU RGB+D 120, BABEL and the proposed combat dataset show that our method adapts to various human motion representations and achieves superior performance over state-of-the-art methods on both single-person and multi-person motion generation tasks, demonstrating a promising step towards a general human motion generator.
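To make the two architectural ideas in the abstract concrete, below is a minimal sketch, not the authors' code: (1) a Gaussian Process latent prior whose RBF kernel yields temporally correlated noise, and (2) a generator that alternates Transformer encoders over the time axis (temporal correlations) and the person axis (human interactions). All module names, dimensions, and the kernel length-scale are illustrative assumptions.

```python
import torch
import torch.nn as nn


def sample_gp_latent(batch, T, dim, length_scale=10.0, device="cpu"):
    """Draw latents z of shape (batch, T, dim) from a GP prior over time.

    An RBF kernel makes nearby frames share correlated noise, giving the
    'inherent temporal correlations' the abstract attributes to the prior.
    """
    t = torch.arange(T, dtype=torch.float32, device=device)
    dists = (t[:, None] - t[None, :]) ** 2
    K = torch.exp(-dists / (2 * length_scale ** 2))         # (T, T) kernel
    L = torch.linalg.cholesky(K + 1e-4 * torch.eye(T, device=device))
    eps = torch.randn(batch, dim, T, device=device)
    return (eps @ L.T).transpose(1, 2)                      # cov over time = K


class InterleavedMotionGenerator(nn.Module):
    """Hypothetical generator: alternates a temporal encoder (attention
    across frames, per person) with an interaction encoder (attention
    across persons, per frame), conditioned on an action label."""

    def __init__(self, d_model=256, n_layers=4, n_heads=8,
                 pose_dim=72, n_actions=13):
        super().__init__()
        self.in_proj = nn.Linear(d_model, d_model)
        self.action_emb = nn.Embedding(n_actions, d_model)
        self.temporal = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            for _ in range(n_layers))
        self.interaction = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            for _ in range(n_layers))
        self.out_proj = nn.Linear(d_model, pose_dim)

    def forward(self, z, action):
        # z: (B, P, T, d_model) GP latents per person; action: (B,) labels.
        B, P, T, D = z.shape
        x = self.in_proj(z) + self.action_emb(action)[:, None, None, :]
        for temp, inter in zip(self.temporal, self.interaction):
            # Attend across frames within each person's sequence.
            x = temp(x.reshape(B * P, T, D)).reshape(B, P, T, D)
            # Attend across persons within each frame.
            x = inter(x.transpose(1, 2).reshape(B * T, P, D)) \
                    .reshape(B, T, P, D).transpose(1, 2)
        return self.out_proj(x)     # (B, P, T, pose_dim) generated motion


# Usage: two interacting persons, 60 frames, action label 3.
z = sample_gp_latent(batch=2 * 2, T=60, dim=256).reshape(2, 2, 60, 256)
motion = InterleavedMotionGenerator()(z, torch.tensor([3, 3]))
```

In a GAN training scheme, this generator would be trained against a discriminator that scores (motion, action) pairs; the GP prior replaces the usual i.i.d. per-frame noise so the latent path itself varies smoothly over time.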
- Liang Xu
- Ziyang Song
- Dongliang Wang
- Jing Su
- Zhicheng Fang
- Chenjing Ding
- Weihao Gan
- Yichao Yan
- Xin Jin
- Xiaokang Yang
- Wenjun Zeng
- Wei Wu