
Human Action Recognition System using Good Features and Multilayer Perceptron Network

Published 22 Aug 2017 in cs.CV, cs.AI, and cs.HC | arXiv:1708.06794v1

Abstract: Human action recognition involves the characterization of human actions through the automated analysis of video data and is integral to the development of smart computer vision systems. However, several challenges, such as dynamic backgrounds, camera stabilization, complex actions, and occlusions, make robust, real-time action recognition difficult. Several complex approaches exist but are computationally intensive. This paper presents a novel approach that combines good features with an iterative optical flow algorithm to compute feature vectors, which are classified using a multilayer perceptron (MLP) network. The use of multiple features as motion descriptors enhances the quality of tracking. The resilient backpropagation algorithm is used for training the feedforward neural network, reducing the learning time. The overall system accuracy is improved by optimizing the various parameters of the multilayer perceptron network.

Citations (9)

Summary

  • The paper presents a HAR system that efficiently extracts motion features and classifies actions with a multilayer perceptron network.
  • It leverages iterative optical flow and refined corner detection to achieve over 92% recognition accuracy on the KTH dataset with limited resources.
  • Simulation results confirm the system's robustness for real-world applications, including surveillance, sign language interpretation, and search and rescue.

Overview

The paper by Jonti Talukdar and Bhavana Mehta presents an innovative approach to Human Action Recognition (HAR) by utilizing a combination of selected features and an iterative optical flow algorithm, with classification performed using a Multilayer Perceptron (MLP) network. HAR is an essential component in the enhancement of smart computer vision systems, finding applications in video surveillance, sign language interpretation, and search and rescue operations. This research addresses the critical challenges faced by HAR systems, notably the complexity and computational intensity of previous methods that rely heavily on local motion descriptors and pose estimation techniques.

Methodology and Technical Contributions

The proposed system leverages "good features to track" in conjunction with an iterative optical flow algorithm to generate motion descriptors that feed into an MLP network for classification. This configuration avoids the computational burden of earlier methods and enables real-time deployment on limited hardware, such as a single-board computer.

  1. Feature Extraction: "Good features to track", selected by a refined corner detection algorithm (Shi-Tomasi), yield high-quality motion descriptors; the quality and uniqueness of the tracked features directly affect the accuracy of the HAR system. The iterative optical flow algorithm complements this by efficiently tracking the strongest motion features across dynamic sequences, so that even when some features are occluded, the system remains robust by exploiting surrounding pixel information (a minimal sketch of this step appears after the list).
  2. Classification with MLP: The feature vectors are classified by an MLP network trained with resilient backpropagation (RPROP) to shorten learning time. The system tunes the number of feature vectors, the number of hidden nodes in the MLP, and the total training samples to improve classification accuracy (see the second sketch after the list).
  3. System Optimization: The paper demonstrates a comprehensive analysis of network parameters to maintain a balance between accuracy and computational efficiency. By adjusting the feature vector size and the structure of the MLP, the system achieves a recognition rate of over 92% for various action classes.
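Neither the abstract nor this overview includes code; as a concrete illustration of the extraction step, the Python/OpenCV sketch below pairs cv2.goodFeaturesToTrack (the Shi-Tomasi "good features" detector) with the iterative pyramidal Lucas-Kanade tracker cv2.calcOpticalFlowPyrLK. All parameter values (corner count, quality level, window size, termination criteria) are illustrative assumptions rather than the paper's settings.

```python
import cv2
import numpy as np

def track_good_features(prev_gray, next_gray, max_corners=10):
    """Detect 'good features' in one grayscale frame and track them into
    the next, returning per-feature displacement vectors (dx, dy)."""
    # Shi-Tomasi corner detection ("good features to track");
    # parameter values here are illustrative assumptions.
    corners = cv2.goodFeaturesToTrack(
        prev_gray, maxCorners=max_corners,
        qualityLevel=0.01, minDistance=7, blockSize=7)
    if corners is None:
        return np.empty((0, 2), dtype=np.float32)

    # Iterative pyramidal Lucas-Kanade optical flow: each feature is
    # refined iteratively within a search window at every pyramid level.
    next_pts, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, next_gray, corners, None,
        winSize=(15, 15), maxLevel=2,
        criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03))

    ok = status.ravel() == 1
    # Displacements of the successfully tracked features serve as the
    # raw motion descriptors.
    return (next_pts[ok] - corners[ok]).reshape(-1, 2)
```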
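For the classification step, OpenCV's ANN_MLP class trains multilayer perceptrons with resilient backpropagation (RPROP), the training algorithm named in the paper, which makes it a convenient vehicle for a sketch. The layer sizes, activation function, and termination criteria below are illustrative assumptions, not the paper's optimized configuration.

```python
import cv2
import numpy as np

def build_mlp(n_inputs, n_hidden, n_classes):
    """A minimal MLP trained with resilient backpropagation (RPROP);
    sizes and criteria are illustrative, not the paper's tuned values."""
    mlp = cv2.ml.ANN_MLP_create()
    # One hidden layer; the paper tunes the hidden-node count.
    mlp.setLayerSizes(np.array([n_inputs, n_hidden, n_classes], dtype=np.int32))
    mlp.setActivationFunction(cv2.ml.ANN_MLP_SIGMOID_SYM)
    mlp.setTrainMethod(cv2.ml.ANN_MLP_RPROP)  # resilient backpropagation
    mlp.setTermCriteria(
        (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 1000, 1e-4))
    return mlp

# Usage sketch: X holds one flattened motion descriptor per clip,
# Y holds one-hot action labels (both float32).
# mlp = build_mlp(X.shape[1], n_hidden=20, n_classes=Y.shape[1])
# mlp.train(X, cv2.ml.ROW_SAMPLE, Y)
# _, out = mlp.predict(X)  # predicted class = out.argmax(axis=1)
```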

Numerical and Performance Insights

Simulation results on the KTH action dataset indicate that the proposed system effectively distinguishes between actions such as walking, running, boxing, and clapping. A feature vector size of 10 provides an optimal balance, achieving an average accuracy of 91.5% across these action classes and showcasing the system's robustness in real-world scenarios with minimal processing delay. These results are notable given the simplicity of the components used, indicating a substantial improvement over previous, computationally intensive methods.
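This overview does not spell out how the fixed-size descriptor is assembled from the tracked features; one plausible reading, sketched below under that assumption, is that the displacements of the strongest tracked features are flattened into a fixed-length vector (zero-padded when fewer features survive tracking) before being fed to the MLP.

```python
import numpy as np

def to_fixed_vector(displacements, n_features=10):
    """Flatten per-feature (dx, dy) displacements into a descriptor of
    2 * n_features values; zero-padding for lost features is an
    assumption, not the paper's stated construction."""
    vec = np.zeros(2 * n_features, dtype=np.float32)
    flat = np.asarray(displacements, np.float32).reshape(-1)[:2 * n_features]
    vec[:flat.size] = flat
    return vec
```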

Implications and Future Directions

This paper contributes a viable solution for deploying HAR systems in resource-constrained environments without sacrificing accuracy, as evidenced by its suitability for real-time implementation on low-cost hardware. The implications for fields requiring immediate action recognition, such as security and healthcare, are significant.

Future work could explore expanding the range of recognized actions by extending the dataset and refining feature extraction techniques to handle increasingly complex environmental interactions. Moreover, integrating deep learning frameworks that further abstract motion features could provide higher recognition rates and broader applicability. Combining this approach with emerging sensor technologies could also broaden its use in autonomous systems and smart city infrastructure.

In conclusion, this paper demonstrates a methodologically sound and efficient approach to HAR, offering a practical alternative to computationally intensive methodologies and setting the stage for future advancements in adaptable, real-time action recognition technologies.
