Pose Flow: Efficient Online Pose Tracking (1802.00977v2)

Published 3 Feb 2018 in cs.CV and cs.AI

Abstract: Multi-person articulated pose tracking in unconstrained videos is an important yet challenging problem. In this paper, going along the road of top-down approaches, we propose a decent and efficient pose tracker based on pose flows. First, we design an online optimization framework to build the association of cross-frame poses and form pose flows (PF-Builder). Second, a novel pose flow non-maximum suppression (PF-NMS) is designed to robustly reduce redundant pose flows and re-link temporally disjoint ones. Extensive experiments show that our method significantly outperforms best-reported results on two standard Pose Tracking datasets by 13 mAP and 25 MOTA, and by 6 mAP and 3 MOTA, respectively. Moreover, in the case of working on detected poses in individual frames, the extra computation of the pose tracker is very minor, guaranteeing online 10 FPS tracking. Our source code is made publicly available (https://github.com/YuliangXiu/PoseFlow).

Citations (309)

Summary

The paper introduces an online tracking framework that links per-frame pose detections into pose flows using two components, PF-Builder and PF-NMS.
It reports gains over the best previously reported results of 13 mAP and 25 MOTA on one PoseTrack dataset and 6 mAP and 3 MOTA on the other, while tracking online at 10 FPS.
The approach improves robustness in multi-person pose tracking, with potential applications in sports analytics and video surveillance.

Efficient Online Pose Tracking Using Pose Flow

The paper "Pose Flow: Efficient Online Pose Tracking" addresses multi-person articulated pose tracking in unconstrained videos. The authors, Xiu et al., from the Machine Vision and Intelligence Group at Shanghai Jiao Tong University, propose an efficient tracker built on the concept of pose flows: sequences of poses, linked across frames, that belong to the same person.

Methodology and Innovations

The paper follows the top-down approach to pose tracking: people are first detected and their poses estimated in each frame, and those per-frame poses are then associated across frames. Xiu et al. introduce two key techniques within their framework:

Pose Flow Building (PF-Builder): This technique constructs pose flows by associating the poses that represent the same individual across frames. The authors propose an online optimization framework that maximizes overall confidence, so that poses are linked across frames even when detections are inconsistent.
Pose Flow Non-Maximum Suppression (PF-NMS): This technique operates on whole pose flows rather than on single frames. It merges redundant pose flows and reconnects flows that were fragmented by detection errors. Because the entire pose flow is treated as one unit in the NMS process, temporal information contributes to the decision, which improves the stability and accuracy of tracking.
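The two steps above can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: the greedy frame-to-frame linking, the mean keypoint distance, the thresholds, and all function names (`pose_distance`, `build_flows`, `flow_nms`) are assumptions made for illustration, and the suppression step only drops duplicate flows (it does not fuse them or re-link fragmented ones as the paper describes).

```python
import numpy as np

def pose_distance(p, q):
    """Mean Euclidean distance between corresponding keypoints, shape (K, 2)."""
    return float(np.linalg.norm(p - q, axis=1).mean())

def build_flows(frames, max_dist=30.0):
    """Greedy online linking (a stand-in for PF-Builder's optimization).

    frames: list over time of lists of (keypoints, score) detections.
    Returns a list of flows; each flow is a list of (t, keypoints, score).
    """
    flows = []
    for t, detections in enumerate(frames):
        # Confident detections choose their flow first.
        order = sorted(range(len(detections)), key=lambda i: -detections[i][1])
        taken = set()  # flows already extended in this frame
        for i in order:
            kpts, score = detections[i]
            best, best_d = None, max_dist
            for f_idx, flow in enumerate(flows):
                last_t, last_kpts, _ = flow[-1]
                if f_idx in taken or last_t != t - 1:
                    continue  # only extend flows that ended in the previous frame
                d = pose_distance(kpts, last_kpts)
                if d < best_d:
                    best, best_d = f_idx, d
            if best is None:
                flows.append([(t, kpts, score)])  # start a new flow
            else:
                flows[best].append((t, kpts, score))
                taken.add(best)
    return flows

def flow_score(flow):
    """Average detection confidence along a flow."""
    return sum(s for _, _, s in flow) / len(flow)

def flow_distance(a, b):
    """Mean pose distance over the frames two flows share (inf if none)."""
    pa = {t: k for t, k, _ in a}
    pb = {t: k for t, k, _ in b}
    shared = pa.keys() & pb.keys()
    if not shared:
        return float("inf")
    return sum(pose_distance(pa[t], pb[t]) for t in shared) / len(shared)

def flow_nms(flows, max_dist=10.0):
    """Flow-level suppression: keep a flow only if no kept flow is a near-duplicate."""
    kept = []
    for flow in sorted(flows, key=flow_score, reverse=True):
        if all(flow_distance(flow, k) > max_dist for k in kept):
            kept.append(flow)
    return kept

# Toy data: person A, a redundant detection of A shifted by 3 px, and person B.
base = np.array([[10.0, 10.0], [10.0, 20.0], [10.0, 30.0]])
frames = []
for t in range(3):
    a = base + np.array([2.0 * t, 0.0])
    frames.append([
        (a, 0.9),                            # person A
        (a + np.array([0.0, 3.0]), 0.5),     # duplicate of A
        (a + np.array([100.0, 0.0]), 0.8),   # person B
    ])

flows = build_flows(frames)
kept = flow_nms(flows)
print(len(flows), "flows before suppression,", len(kept), "after")
```

Comparing whole flows, rather than single-frame poses, is the design point the sketch tries to convey: the duplicate is rejected because it stays close to person A across every shared frame, not because of one noisy frame.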

Experimental Results

The authors conducted thorough experiments on two standard datasets: PoseTrack and PoseTrack Challenge. The results demonstrate notable performance improvements:

The proposed method outperforms prior approaches on both mean Average Precision (mAP) and Multi-Object Tracking Accuracy (MOTA). It improves on the best previously reported results by 13 mAP and 25 MOTA on PoseTrack, and by 6 mAP and 3 MOTA on the PoseTrack Challenge dataset.
Given poses already detected in individual frames, the tracker adds very little computation and supports online tracking at 10 frames per second, which makes it a candidate for near-real-time applications.
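For readers unfamiliar with the tracking metric, MOTA in its standard multi-object tracking form is one minus the ratio of total errors (misses, false positives, and identity switches) to the number of ground-truth objects. The sketch below shows only that core formula; the function name and the counts are made up for illustration, and benchmark-specific details of the PoseTrack evaluation are not modeled.

```python
def mota(false_negatives, false_positives, id_switches, num_ground_truth):
    """Standard MOTA: 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_ground_truth

# Illustrative counts only (not taken from the paper).
example = mota(false_negatives=20, false_positives=10, id_switches=5,
               num_ground_truth=100)
print(f"MOTA = {100 * example:.1f}")
```

Because errors can exceed the number of ground-truth objects, MOTA can be negative. Scores are usually reported as percentages, so a gain of "25 MOTA" means 25 percentage points.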

Implications and Future Directions

The paper's contributions improve tracking robustness and accuracy, making it a useful reference for pose tracking in dynamic and unconstrained settings. PF-Builder and PF-NMS provide a foundation for future work on pose estimation and tracking.

In practical terms, the ability to track multiple poses online at an interactive frame rate has a wide range of applications, from sports analytics to video surveillance. On the research side, this work sets the stage for more sophisticated models that incorporate additional contextual information such as motion trajectories and scene semantics.

Further research could explore integrating these techniques with advanced neural network architectures to handle more complex scenarios, such as heavily occluded environments or scenarios with dense crowds. Moreover, incorporating three-dimensional spatial information could further enhance tracking accuracy and stability.

In sum, Xiu et al.'s work offers valuable insights into enhancing the efficiency and performance of pose tracking systems, marking a significant step forward in the domain of video-based human pose estimation and behavior analysis.

GitHub

GitHub - YuliangXiu/PoseFlow: PoseFlow: Efficient Online Pose Tracking (BMVC'18) (435 stars)
