Generating Smooth Pose Sequences for Diverse Human Motion Prediction (2108.08422v3)

Published 19 Aug 2021 in cs.CV

Abstract: Recent progress in stochastic motion prediction, i.e., predicting multiple possible future human motions given a single past pose sequence, has led to producing truly diverse future motions and even providing control over the motion of some body parts. However, to achieve this, the state-of-the-art method requires learning several mappings for diversity and a dedicated model for controllable motion prediction. In this paper, we introduce a unified deep generative network for both diverse and controllable motion prediction. To this end, we leverage the intuition that realistic human motions consist of smooth sequences of valid poses, and that, given limited data, learning a pose prior is much more tractable than a motion one. We therefore design a generator that predicts the motion of different body parts sequentially, and introduce a normalizing flow based pose prior, together with a joint angle loss, to achieve motion realism. Our experiments on two standard benchmark datasets, Human3.6M and HumanEva-I, demonstrate that our approach outperforms the state-of-the-art baselines in terms of both sample diversity and accuracy. The code is available at https://github.com/wei-mao-2019/gsps

Summary

The paper introduces a unified deep generative network that produces diverse and controllable human motion predictions from past pose sequences.
It employs normalizing flows and a joint angle constraint to ensure realistic, smooth, and physiologically valid pose sequences.
Experiments on Human3.6M and HumanEva-I datasets demonstrate superior diversity (APD) and accuracy (ADE, FDE) compared to current state-of-the-art methods.

This paper addresses the challenge of predicting diverse and realistic future human motions from a sequence of past poses, a task with significant implications in fields such as autonomous driving, animation, and human-robot interaction. The core contribution lies in an innovative model capable of producing both diverse and controllable motion predictions through a unified deep generative approach.

Overview

Traditional deterministic approaches to human motion prediction predict only the single most likely future sequence from past data. However, one observed motion can lead to multiple plausible futures, especially over longer time horizons. Stochastic methods, often built on VAEs, tend to concentrate their samples on the major modes of the data distribution at the cost of diversity. More recent methods that learn multiple parallel mappings, such as DLow, achieve diversity but need a separate, dedicated model for controllable prediction.

This paper introduces a consolidated approach utilizing a deep generative network that performs both tasks by generating motions for different body parts sequentially. Specifically, it employs a pose prior modeled by normalizing flows and a joint angle constraint to ensure pose validity and sequence smoothness.

Methodology

Pose Prior and Joint Angle Constraint: The model uses a normalizing flow as a prior over individual poses rather than over whole motions, since a pose prior is easier to learn from limited data. Because a normalizing flow allows exact log-likelihood computation, the prior can score each generated pose and push the generator toward feasible human poses. A joint angle loss complements the prior by penalizing joint angles that violate human kinematic limits, which further improves realism.
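To make the two ingredients above concrete, here is a minimal sketch in plain Python. It assumes the simplest possible flow (one elementwise affine transform with a standard normal base distribution) and a squared hinge penalty on angle limits. Both choices, and the names `affine_flow_log_prob` and `joint_angle_penalty`, are illustrative assumptions and not the paper's actual architecture or loss.

```python
import math


def standard_normal_logpdf(z):
    """Log density of a 1D standard normal at z."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))


def affine_flow_log_prob(pose, shift, log_scale):
    """Exact log-likelihood of a pose under a toy elementwise affine flow.

    Assumed (hypothetical) flow: z_i = (x_i - shift_i) * exp(-log_scale_i).
    Change of variables: log p(x) = sum_i log N(z_i) + log|det dz/dx|,
    and for this flow log|det dz/dx| = -sum_i log_scale_i.
    """
    log_prob = 0.0
    for x, b, s in zip(pose, shift, log_scale):
        z = (x - b) * math.exp(-s)
        log_prob += standard_normal_logpdf(z) - s
    return log_prob


def joint_angle_penalty(angles, lower, upper):
    """Squared hinge penalty: zero inside [lower, upper], quadratic outside.

    The limits are assumed inputs; this is one common way to penalize
    out-of-range angles, not necessarily the form used in the paper.
    """
    total = 0.0
    for a, lo, hi in zip(angles, lower, upper):
        total += max(0.0, lo - a) ** 2 + max(0.0, a - hi) ** 2
    return total


# Usage: score a 2D "pose" and penalize one angle that exceeds its limit.
pose_log_prob = affine_flow_log_prob([1.0, 0.0], [1.0, 0.0], [0.0, 0.0])
angle_cost = joint_angle_penalty([0.5, 2.0], [0.0, 0.0], [1.0, 1.5])
```

In training, the negative log-likelihood and the angle penalty would both be added to the generator's objective; a real flow would stack many learned invertible layers rather than a single affine map.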

Sequential Prediction of Body Parts: Rather than predicting the whole body at once, the proposed model predicts the future motion of different body parts one after another. This design inherently enables controllable prediction, such as fixing one body part's motion while letting the others vary. No separate model is needed for this, which was a limitation of prior work such as DLow.
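The control mechanism can be illustrated with a toy sketch. It assumes two parts (lower and upper body), each reduced to a single scalar per frame, with one noise variable per part; the upper body is generated after, and conditioned on, the lower body. The functions `predict_lower`, `predict_upper`, and `sample_with_fixed_lower` are hypothetical stand-ins for a learned generator, not the paper's network.

```python
import random

HORIZON = 3  # number of future frames in this toy example


def predict_lower(last_lower, z_lower):
    """Toy lower-body predictor: drifts from the last pose along z_lower."""
    return [last_lower + 0.1 * z_lower * (t + 1) for t in range(HORIZON)]


def predict_upper(last_upper, lower_future, z_upper):
    """Toy upper-body predictor, conditioned on the predicted lower body."""
    return [
        last_upper + 0.1 * z_upper * (t + 1) + 0.05 * lower_future[t]
        for t in range(HORIZON)
    ]


def sample_with_fixed_lower(last_lower, last_upper, num_samples, seed=0):
    """Draw several futures that share one lower-body motion.

    The lower-body noise is sampled once and reused, so only the
    upper body varies across the returned samples.
    """
    rng = random.Random(seed)
    z_lower = rng.gauss(0.0, 1.0)
    lower_future = predict_lower(last_lower, z_lower)
    samples = []
    for _ in range(num_samples):
        z_upper = rng.gauss(0.0, 1.0)
        upper_future = predict_upper(last_upper, lower_future, z_upper)
        samples.append((lower_future, upper_future))
    return samples


# Usage: four futures with identical leg motion and varying upper body.
futures = sample_with_fixed_lower(last_lower=0.0, last_upper=1.0, num_samples=4)
```

Resampling `z_lower` as well would recover unconstrained diverse prediction from the same generator, which is the sense in which one model serves both tasks.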

Performance and Results: The model was evaluated on the Human3.6M and HumanEva-I datasets, where it outperformed the compared baselines on both diversity (APD) and accuracy (ADE, FDE) metrics. The approach also demonstrated improved part-based motion control.
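For readers unfamiliar with these metrics, the sketch below follows their usual definitions in the stochastic motion prediction literature: APD is the average pairwise L2 distance between sampled motions, ADE is the time-averaged L2 error of the sample closest to the ground truth, and FDE is the last-frame L2 error of the closest sample. The data layout (a motion as a list of per-frame pose vectors) is an assumption made for illustration.

```python
import math
from itertools import combinations


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def apd(samples):
    """Average pairwise distance between flattened motions (needs >= 2 samples)."""
    flat = [[v for pose in motion for v in pose] for motion in samples]
    pairs = list(combinations(flat, 2))
    return sum(l2(a, b) for a, b in pairs) / len(pairs)


def ade(samples, ground_truth):
    """Average displacement error of the best sample (mean over frames)."""
    return min(
        sum(l2(p, g) for p, g in zip(motion, ground_truth)) / len(ground_truth)
        for motion in samples
    )


def fde(samples, ground_truth):
    """Final displacement error of the best sample (last frame only)."""
    return min(l2(motion[-1], ground_truth[-1]) for motion in samples)


# Usage: two sampled motions of two frames each, with 2D pose vectors.
gt = [[0.0, 0.0], [0.0, 0.0]]
preds = [
    [[3.0, 4.0], [0.0, 0.0]],
    [[0.0, 0.0], [6.0, 8.0]],
]
diversity = apd(preds)      # higher means more diverse samples
avg_error = ade(preds, gt)  # lower is better
end_error = fde(preds, gt)  # lower is better
```

Because ADE and FDE take the minimum over samples, a method can score well on them only if at least one of its diverse samples lands near the true future.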

Implications

The model has clear practical uses: more realistic and flexible animation in games and film, better anticipation of human movement in robotic systems, and safer navigation for autonomous systems that must predict how nearby people will move.

Theoretically, it showcases the potential of integrating deep generative models with structured constraints (pose priors and joint angle limits) to solve complex motion prediction problems. This unified framework challenges existing paradigms that separate controllable from diverse motion generation and sets a precedent for future work aiming to bridge these functionalities.

Future Directions

Future research may explore further refinement in the granularity of control over body parts' motion, adapting the model for dynamic environments, and integrating semantic context for even higher prediction accuracy. Extending the framework to real-time applications and broader datasets would further cement its applicability across various domains requiring human motion understanding and prediction.

In summary, this paper takes a significant step towards more capable human motion prediction models by balancing diversity and control within a single generative framework.

GitHub

GitHub - wei-mao-2019/gsps: Official implementation for the paper: Generating Smooth Pose Sequences for Diverse Human Motion Prediction (42 stars)
