
PYSKL: Towards Good Practices for Skeleton Action Recognition

Published 19 May 2022 in cs.CV (arXiv:2205.09443v1)

Abstract: We present PYSKL: an open-source toolbox for skeleton-based action recognition based on PyTorch. The toolbox supports a wide variety of skeleton action recognition algorithms, including approaches based on GCN and CNN. In contrast to existing open-source skeleton action recognition projects that include only one or two algorithms, PYSKL implements six different algorithms under a unified framework with both the latest and original good practices to ease the comparison of efficacy and efficiency. We also provide an original GCN-based skeleton action recognition model named ST-GCN++, which achieves competitive recognition performance without any complicated attention schemes, serving as a strong baseline. Meanwhile, PYSKL supports the training and testing of nine skeleton-based action recognition benchmarks and achieves state-of-the-art recognition performance on eight of them. To facilitate future research on skeleton action recognition, we also provide a large number of trained models and detailed benchmark results to give some insights. PYSKL is released at https://github.com/kennymckormick/pyskl and is actively maintained. We will update this report when we add new features or benchmarks. The current version corresponds to PYSKL v0.2.


Summary

  • The paper introduces PYSKL, a unified toolbox implementing six skeleton action recognition algorithms, including the novel ST-GCN++ model, which achieves 92.6% accuracy on NTURGB+D XSub.
  • It emphasizes robust preprocessing and standardized practices that minimize performance variance among various Graph Convolutional Network approaches.
  • Extensive benchmarking across nine datasets demonstrates that systematic practices can outperform complex architectures in action recognition tasks.

An Overview of PYSKL: Practices for Skeleton Action Recognition

The paper presents PYSKL, a comprehensive open-source toolbox built on PyTorch and aimed at advancing skeleton-based action recognition. Skeleton action recognition uses human skeletal data to recognize actions, offering advantages over modalities such as RGB thanks to the compactness of skeletal representations and their robustness to changes in appearance and background.
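To make the input format concrete, here is a minimal sketch of the dense tensor layout commonly used for skeleton clips in ST-GCN-style codebases. The (N, C, T, V, M) convention and the specific sizes are illustrative assumptions, not necessarily PYSKL's exact internal layout.

```python
import torch

# Shapes below follow the (N, C, T, V, M) convention common to ST-GCN-style
# codebases (an assumption, not necessarily PYSKL's exact internal layout):
#   N - batch size, C - channels per joint (e.g. x, y, confidence score),
#   T - frames, V - joints per skeleton, M - persons per frame.
N, C, T, V, M = 8, 3, 100, 17, 2            # e.g. COCO 17-keypoint skeletons
clip = torch.randn(N, C, T, V, M)

# Compactness in practice: one 100-frame clip is ~40 KiB of float32 values,
# orders of magnitude smaller than the corresponding RGB frames.
print(clip[0].numel() * 4 / 1024, "KiB per clip")
```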

Key Contributions

  1. Diverse Algorithm Support: PYSKL implements six skeleton action recognition algorithms, spanning both GCN- and CNN-based methodologies, under a unified framework. This integration eases comparison and benchmarking across approaches (see the config sketch after this list).
  2. Introduction of ST-GCN++: An original GCN-based model, ST-GCN++ achieves competitive performance without complex attention mechanisms, serving as a strong yet simple baseline.
  3. Comprehensive Benchmarking: PYSKL supports training and testing across nine skeleton-based action recognition benchmarks, achieving state-of-the-art results on eight.
  4. Robust Preprocessing and Practices: The toolbox emphasizes good practices encompassing data preprocessing, augmentations, and hyperparameter settings, contributing significantly to performance improvements.
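To illustrate the first contribution, the snippet below sketches config-driven model construction, the pattern that lets one training and testing pipeline cover all six algorithms. The class names, config keys, and registry are hypothetical stand-ins, not the exact PYSKL API; PYSKL itself builds models from config files in the MMCV style.

```python
import torch.nn as nn

# Hypothetical sketch of config-driven model construction; names are
# illustrative stand-ins, not the actual PYSKL registry or classes.

class STGCN(nn.Module):
    def __init__(self, num_classes, in_channels=3):
        super().__init__()
        self.classifier = nn.Linear(in_channels, num_classes)  # stand-in backbone

class CTRGCN(STGCN):
    pass

MODELS = {'STGCN': STGCN, 'CTRGCN': CTRGCN}   # one registry entry per algorithm

def build_model(cfg: dict) -> nn.Module:
    cfg = dict(cfg)                            # avoid mutating the caller's config
    return MODELS[cfg.pop('type')](**cfg)

# Swapping algorithms is a one-line config change; the pipeline stays the same.
model = build_model(dict(type='STGCN', num_classes=60))
```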

Technical Advancements

GCN Approaches

The paper emphasizes Graph Convolutional Networks (GCNs) for processing skeleton data, a popular approach since ST-GCN's introduction. Over time, enhancements such as improved graph topologies and auxiliary task integration have pushed performance further. Notably, PYSKL finds only small performance gaps among the various GCN models once preprocessing and training practices are held consistent, suggesting those practices account for much of the differences reported in prior work.
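To ground the GCN discussion, the sketch below implements a single spatial graph convolution in the spirit of ST-GCN: joint features are aggregated along skeleton edges via an adjacency matrix, then linearly transformed. Real models use several partitioned and normalized adjacencies, learnable graph refinements, and interleaved temporal convolutions; the identity adjacency here is only a placeholder.

```python
import torch
import torch.nn as nn

# Minimal spatial graph convolution in the spirit of ST-GCN. Features at each
# joint are mixed along skeleton edges via an adjacency matrix A, then
# transformed by a learnable 1x1 convolution.

class SpatialGraphConv(nn.Module):
    def __init__(self, in_ch, out_ch, A):
        super().__init__()
        self.register_buffer('A', A)                    # (V, V) adjacency matrix
        self.proj = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):                               # x: (N, C, T, V)
        x = torch.einsum('nctv,vw->nctw', x, self.A)    # aggregate neighbor joints
        return self.proj(x)                             # per-joint feature transform

V = 17
A = torch.eye(V)                                        # placeholder: identity graph
layer = SpatialGraphConv(3, 64, A)
out = layer(torch.randn(2, 3, 100, V))                  # -> (2, 64, 100, 17)
```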

CNN-Based Approach

The CNN paradigm, represented here by PoseC3D, renders skeleton sequences as stacked pseudo-heatmap volumes and processes them with 3D-CNNs, much as one would process video clips. This approach is robust to pose-estimation noise but less computationally efficient than GCN methods.
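A minimal sketch of the pseudo-heatmap idea behind PoseC3D follows: each 2D keypoint is rendered as a Gaussian heatmap, and the heatmaps are stacked over time into a volume a 3D-CNN consumes like a video clip. The resolution, sigma, and joints-as-channels layout are illustrative; the actual pipeline described in the PoseC3D work (e.g. limb heatmaps, subject-centered cropping, temporal sampling) is more involved.

```python
import torch

# Render 2D keypoints as Gaussian heatmaps and stack them over time into a
# volume for a 3D-CNN. Resolution and sigma are illustrative choices.

def keypoints_to_heatmaps(kpts, H=56, W=56, sigma=2.0):
    """kpts: (T, V, 2) pixel coordinates -> (V, T, H, W) heatmap volume."""
    T, V, _ = kpts.shape
    ys = torch.arange(H).view(1, 1, H, 1).float()
    xs = torch.arange(W).view(1, 1, 1, W).float()
    cx = kpts[..., 0].view(T, V, 1, 1)
    cy = kpts[..., 1].view(T, V, 1, 1)
    heat = torch.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
    return heat.permute(1, 0, 2, 3)           # (V, T, H, W): joints as channels

volume = keypoints_to_heatmaps(torch.rand(48, 17, 2) * 56)
print(volume.shape)                           # torch.Size([17, 48, 56, 56])
```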

Numerical Results

PYSKL's benchmarking yields strong results. For instance, on the NTURGB+D XSub benchmark, ST-GCN++ trained with the updated practices reaches 92.6% recognition accuracy, slightly surpassing the previous state-of-the-art CTR-GCN. These results underscore that consistent practices can matter more than architectural complexity.

Implications and Future Directions

Practically, PYSKL facilitates streamlined comparisons of skeleton-based action recognition methods and accelerates research by providing pre-trained models and detailed benchmarks. Theoretically, the work suggests that while architectural innovations contribute to performance, systematic practices are pivotal, an insight beneficial for future architecture design.

Looking forward, PYSKL's methodologies could expand to accommodate multi-modality inputs beyond skeletal data, further enhancing action recognition capabilities. Additionally, exploring lighter models with comparable accuracy could address computational concerns, especially for real-time applications.

In conclusion, PYSKL represents a substantial step in unifying and enhancing skeleton action recognition research, providing valuable tools and insights for the community. The open-source nature and ongoing updates promise to continually refine and advance the field.
