
Joint Temporal Pooling for Improving Skeleton-based Action Recognition (2408.09356v1)

Published 18 Aug 2024 in cs.CV

Abstract: In skeleton-based human action recognition, temporal pooling is a critical step for capturing the spatiotemporal relationships of joint dynamics. Conventional pooling methods treat every frame equally and do not preserve motion information. However, in an action sequence, only a few segments of frames carry discriminative information related to the action. This paper presents a novel Joint Motion Adaptive Temporal Pooling (JMAP) method for improving skeleton-based action recognition. Two variants of JMAP, frame-wise pooling and joint-wise pooling, are introduced. The efficacy of JMAP has been validated through experiments on the popular NTU RGB+D 120 and PKU-MMD datasets.
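The abstract does not spell out how JMAP selects frames, so the following is only an illustrative sketch of the general idea of motion-adaptive temporal pooling, not the authors' implementation. It assumes that "motion" is the squared displacement of each joint between consecutive frames and that pooling operates over fixed-size temporal windows; the function names `joint_motion`, `frame_wise_pool`, and `joint_wise_pool` and the `window` parameter are hypothetical.

```python
from typing import List

Frame = List[List[float]]   # J joints, each a list of C coordinates
Sequence = List[Frame]      # T frames


def joint_motion(seq: Sequence) -> List[List[float]]:
    """Return motion[t][j]: squared displacement of joint j from frame t-1 to t.

    Frame 0 has no predecessor, so its motion is defined as 0 (an assumption).
    """
    motion = [[0.0] * len(seq[0])]
    for prev, cur in zip(seq, seq[1:]):
        motion.append([
            sum((a - b) ** 2 for a, b in zip(pj, cj))
            for pj, cj in zip(prev, cur)
        ])
    return motion


def frame_wise_pool(seq: Sequence, window: int) -> Sequence:
    """Per window, keep the whole frame with the largest total joint motion."""
    motion = joint_motion(seq)
    pooled = []
    for start in range(0, len(seq), window):
        idx = range(start, min(start + window, len(seq)))
        best = max(idx, key=lambda t: sum(motion[t]))
        pooled.append(seq[best])
    return pooled


def joint_wise_pool(seq: Sequence, window: int) -> Sequence:
    """Per window, keep each joint from the frame where that joint moved most."""
    motion = joint_motion(seq)
    num_joints = len(seq[0])
    pooled = []
    for start in range(0, len(seq), window):
        idx = range(start, min(start + window, len(seq)))
        frame = []
        for j in range(num_joints):
            best = max(idx, key=lambda t: motion[t][j])
            frame.append(seq[best][j])
        pooled.append(frame)
    return pooled


# Toy sequence: 4 frames, 2 joints, 1 coordinate per joint.
toy = [
    [[0.0], [0.0]],
    [[1.0], [5.0]],
    [[4.0], [5.0]],
    [[4.0], [6.0]],
]
frame_result = frame_wise_pool(toy, window=4)
joint_result = joint_wise_pool(toy, window=4)
```

In this toy case the two variants differ: frame-wise pooling keeps one whole frame per window, while joint-wise pooling may assemble its output from different frames for different joints. Plain average or max pooling, by contrast, would weight all four frames identically regardless of motion.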
