
Towards Enhanced Human Activity Recognition through Natural Language Generation and Pose Estimation (2312.06965v1)

Published 12 Dec 2023 in cs.HC

Abstract: Vision-based human activity recognition (HAR) has made substantial progress in recognizing predefined gestures but lacks adaptability for emerging activities. This paper introduces a paradigm shift by harnessing generative modeling and large language models (LLMs) to enhance vision-based HAR. We propose utilizing LLMs to generate descriptive textual representations of activities using pose keypoints as an intermediate representation. Incorporating pose keypoints adds contextual depth to the recognition process, allowing for sequences of vectors resembling text chunks, compatible with LLMs. This innovative fusion of computer vision and natural language processing holds significant potential for revolutionizing activity recognition. A proof-of-concept study on a subset of the Kinetics-700 dataset validates the approach's efficacy, highlighting improved accuracy and interpretability. Future implications encompass enhanced accuracy, novel research avenues, model generalization, and ethical considerations for transparency. This framework has real-world applications, including personalized gym workout feedback and nuanced sports training insights. By connecting visual cues to interpretable textual descriptions, the proposed framework advances HAR accuracy and applicability, shaping the landscape of pervasive computing and activity recognition research. As this approach evolves, it promises a more insightful understanding of human activities across diverse contexts, marking a significant step towards a better world.
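The abstract's idea of treating pose keypoints as "sequences of vectors resembling text chunks" can be illustrated with a minimal sketch. This is not the authors' implementation: the quantization scheme, the prompt wording, and the helper names `quantize`, `frame_to_text`, and `clip_to_prompt` are all assumptions made for illustration. It assumes keypoints arrive as normalized (x, y) pairs per frame, as a pose estimator might produce after rescaling.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation: one (x, y) pair per joint, normalized to [0, 1].
Keypoint = Tuple[float, float]


def quantize(value: float, bins: int = 100) -> int:
    """Clip a normalized coordinate to [0, 1] and map it to an integer bin."""
    clipped = min(max(value, 0.0), 1.0)
    return min(int(clipped * bins), bins - 1)


def frame_to_text(keypoints: Sequence[Keypoint], bins: int = 100) -> str:
    """Serialize one frame's keypoints as space-separated 'x,y' tokens."""
    return " ".join(
        f"{quantize(x, bins)},{quantize(y, bins)}" for x, y in keypoints
    )


def clip_to_prompt(frames: Sequence[Sequence[Keypoint]], bins: int = 100) -> str:
    """Build a text prompt (assumed wording) from a sequence of pose frames."""
    lines: List[str] = [
        f"frame {i}: {frame_to_text(frame, bins)}"
        for i, frame in enumerate(frames)
    ]
    return (
        "Pose keypoints per frame:\n"
        + "\n".join(lines)
        + "\nDescribe the activity:"
    )
```

The resulting string could then be passed to any text-generation model. Quantizing coordinates keeps the vocabulary of numeric tokens small, which is one plausible design choice among several.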

Authors (8)
  1. Nikhil Kashyap
  2. Manas Satish Bedmutha
  3. Prerit Chaudhary
  4. Brian Wood
  5. Wanda Pratt
  6. Janice Sabin
  7. Andrea Hartzler
  8. Nadir Weibel