Impact of Three-Dimensional Video Scalability on Multi-View Activity Recognition using Deep Learning (1709.10206v1)

Published 29 Sep 2017 in cs.MM

Abstract: Human activity recognition is one of the important research topics in computer vision and video understanding. It is often assumed that high quality video sequences are available for recognition. However, relaxing such a requirement and implementing robust recognition using videos having reduced data rates can achieve efficiency in storing and transmitting video data. Three-dimensional video scalability, which refers to the possibility of reducing spatial, temporal, and quality resolutions of videos, is an effective way for flexible representation and management of video data. In this paper, we investigate the impact of the video scalability on multi-view activity recognition. We employ both a spatiotemporal feature extraction-based method and a deep learning-based method using convolutional and recurrent neural networks. The recognition performance of the two methods is examined, along with in-depth analysis regarding how their performance vary with respect to various scalability combinations. In particular, we demonstrate that the deep learning-based method can achieve significantly improved robustness in comparison to the feature-based method. Furthermore, we investigate optimal scalability combinations with respect to bitrate in order to provide useful guidelines for an optimal operation policy in resource-constrained activity recognition systems.

Citations (4)

View on Semantic Scholar

Summary

We haven't generated a summary for this paper yet.

Summarize Now

Impact of Three-Dimensional Video Scalability on Multi-View Activity Recognition using Deep Learning (1709.10206v1)

Summary

Related Papers