Structured Prediction Helps 3D Human Motion Modelling (1910.09070v1)

Published 20 Oct 2019 in cs.CV

Abstract: Human motion prediction is a challenging and important task in many computer vision application domains. Existing work only implicitly models the spatial structure of the human skeleton. In this paper, we propose a novel approach that decomposes the prediction into individual joints by means of a structured prediction layer that explicitly models the joint dependencies. This is implemented via a hierarchy of small-sized neural networks connected analogously to the kinematic chains in the human body as well as a joint-wise decomposition in the loss function. The proposed layer is agnostic to the underlying network and can be used with existing architectures for motion modelling. Prior work typically leverages the H3.6M dataset. We show that some state-of-the-art techniques do not perform well when trained and tested on AMASS, a recently released dataset 14 times the size of H3.6M. Our experiments indicate that the proposed layer increases the performance of motion forecasting irrespective of the base network, joint-angle representation, and prediction horizon. We furthermore show that the layer also improves motion predictions qualitatively. We make code and models publicly available at https://ait.ethz.ch/projects/2019/spl.

Summary

Analysis of Structured Prediction Layer for 3D Human Motion Modelling

The paper "Structured Prediction Helps 3D Human Motion Modelling" proposes a structured prediction layer (SPL) intended to improve the accuracy and robustness of 3D human motion prediction. The layer explicitly decomposes a pose into individual joints and exploits the spatial dependencies given by the human skeleton, which the paper argues prior motion prediction models capture only implicitly.

Methodological Advancements

The SPL is implemented as a hierarchy of neural networks connected according to the kinematic chains in the human body. This method enables the prediction of individual joint movements based on their parent joint, capturing spatial dependencies inherently present in human motion. Importantly, the SPL is independent of the base architecture, allowing seamless integration with existing deep learning models, such as recurrent neural networks (RNNs), GRUs, and QuaterNet, in motion modelling tasks.
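The hierarchy described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the five-joint skeleton in `PARENTS`, the layer sizes, the random weights, and the choice to feed each joint network only its immediate parent's prediction are all assumptions made for the example.

```python
import numpy as np

# Hypothetical 5-joint skeleton: PARENTS[k] is the parent of joint k, -1 marks the root.
# Joints are listed so that every parent comes before its children.
PARENTS = [-1, 0, 1, 0, 3]
JOINT_DIM = 4      # e.g. one quaternion per joint (assumption)
CONTEXT_DIM = 16   # size of the base network's output features (assumption)
HIDDEN_DIM = 8     # width of each small per-joint network (assumption)


def init_joint_net(rng, in_dim):
    """Randomly initialise one small two-layer MLP for a single joint."""
    return {
        "w1": rng.normal(0.0, 0.1, (in_dim, HIDDEN_DIM)),
        "b1": np.zeros(HIDDEN_DIM),
        "w2": rng.normal(0.0, 0.1, (HIDDEN_DIM, JOINT_DIM)),
        "b2": np.zeros(JOINT_DIM),
    }


def init_spl(rng):
    """Create one network per joint; non-root joints also take the parent's prediction."""
    nets = []
    for parent in PARENTS:
        in_dim = CONTEXT_DIM + (JOINT_DIM if parent >= 0 else 0)
        nets.append(init_joint_net(rng, in_dim))
    return nets


def spl_forward(nets, context):
    """Predict joints along the kinematic tree, parents before children.

    context: (batch, CONTEXT_DIM) features from any base network.
    Returns an array of shape (batch, num_joints * JOINT_DIM).
    """
    preds = [None] * len(PARENTS)
    for k, parent in enumerate(PARENTS):
        if parent < 0:
            x = context
        else:
            x = np.concatenate([context, preds[parent]], axis=-1)
        hidden = np.maximum(x @ nets[k]["w1"] + nets[k]["b1"], 0.0)  # ReLU
        preds[k] = hidden @ nets[k]["w2"] + nets[k]["b2"]
    return np.concatenate(preds, axis=-1)


rng = np.random.default_rng(0)
nets = init_spl(rng)
context = rng.normal(size=(2, CONTEXT_DIM))
pose = spl_forward(nets, context)
print(pose.shape)  # (2, 20)
```

Because the layer only consumes a feature vector, the same sketch could sit on top of any base model that produces such a vector, which is the sense in which the layer is architecture-agnostic.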

Dataset Utilization

Experiments are conducted on both the Human3.6M (H3.6M) dataset and AMASS, a more extensive dataset encompassing diverse motion sequences. The latter consists of approximately 42 hours of motion capture data and includes variations far surpassing those in H3.6M, thereby presenting a more challenging and comprehensive benchmark for evaluating motion prediction methodologies.

Evaluation and Metrics

Performance is evaluated with several metrics: Euler angle differences, joint-angle differences, and positional error computed on reconstructed 3D joint positions. The SPL improves predictive accuracy across these metrics and on both datasets, and on AMASS the SPL variants outperform their sequence-to-sequence and other baseline counterparts. The gains extend to long prediction horizons, which underscores the value of modelling spatial structure explicitly.
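As a rough illustration of the position-based metrics, the sketch below computes a mean Euclidean joint error and a keypoint-accuracy score at a single distance threshold. The function names, array shapes, and the threshold value are assumptions chosen for the example and are not taken from the paper's evaluation code.

```python
import numpy as np


def positional_error(pred, target):
    """Mean Euclidean distance between predicted and target joint positions.

    pred, target: arrays of shape (frames, joints, 3).
    """
    return float(np.linalg.norm(pred - target, axis=-1).mean())


def pck(pred, target, threshold):
    """Fraction of joints whose predicted position lies within `threshold` of the target."""
    dist = np.linalg.norm(pred - target, axis=-1)
    return float((dist <= threshold).mean())


# Toy example: four joints in one frame, one joint displaced by 0.5 along x.
target = np.zeros((1, 4, 3))
pred = target.copy()
pred[0, 0, 0] = 0.5

print(positional_error(pred, target))    # 0.125
print(pck(pred, target, threshold=0.1))  # 0.75
```

Sweeping `threshold` over a range of values and averaging the scores would give a single summary number across thresholds.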

Results and Implications

This work shows that learned models such as recurrent and sequence-to-sequence networks benefit from incorporating SPL, with gains in metrics such as Percentage of Correct Keypoints (PCK), and compares them against the simple zero-velocity baseline. The integration of SPL improves performance even when the base models use different joint-angle representations.
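For context, the zero-velocity baseline mentioned above involves no learning: it repeats the last observed pose for every future frame. A minimal sketch follows, assuming poses are stored as rows of a `(frames, pose_dim)` array; the function name and the toy data are hypothetical.

```python
import numpy as np


def zero_velocity_forecast(seed, horizon):
    """Repeat the last observed pose for `horizon` future frames.

    seed: array of shape (frames, pose_dim) holding the observed motion.
    Returns an array of shape (horizon, pose_dim).
    """
    return np.repeat(seed[-1:], horizon, axis=0)


seed = np.arange(12, dtype=float).reshape(4, 3)  # 4 observed frames, 3 pose values each
forecast = zero_velocity_forecast(seed, horizon=2)
print(forecast.shape)  # (2, 3)
```

Any learned model, with or without SPL, has to beat this constant-pose prediction to be useful.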

The SPL has substantial practical implications in fields requiring human motion prediction, such as activity recognition, human-robot interaction, and pose estimation for autonomous vehicles. The findings suggest that future developments in AI-driven motion prediction should continue to explore structured prediction approaches, emphasizing spatial dependencies inherent to the tasks.

In summary, the structured prediction layer advances human motion prediction by explicitly modelling the spatial dependencies among joints, which leads to more realistic and accurate motion modelling. Future research could refine the approach further, for example by exploring adaptive architectures that improve performance across broader datasets and applications.