- The paper systematically evaluates deep, convolutional, and recurrent networks for HAR using wearable sensors, demonstrating superior results compared to traditional methods.
- It finds that bi-directional LSTMs excel at tasks with strong temporal structure, reaching a mean F1-score of 0.745 on the Opportunity dataset, while CNNs provide robust performance for repetitive activity patterns.
- The study emphasizes the critical role of hyperparameter tuning, such as adjusting learning rates in CNNs and carry-over probabilities in LSTM-based models.
Deep, Convolutional, and Recurrent Models for Human Activity Recognition using Wearables
Introduction
The paper "Deep, Convolutional, and Recurrent Models for Human Activity Recognition using Wearables" by Hammerla et al. provides a systematic evaluation of state-of-the-art deep learning approaches within the field of human activity recognition (HAR) using wearable sensors. This exploration is pivotal given the inertia in adopting deep learning techniques in this field, which has traditionally relied on manual feature extraction and basic classification methods.
Methodology
The research rigorously investigates deep, convolutional, and recurrent models across three representative datasets to benchmark their effectiveness in HAR. The datasets encompass a spectrum of activities, from manipulative gestures to repetitive motions and medical applications. The main models investigated include deep feed-forward networks (DNN), convolutional neural networks (CNN), and recurrent neural networks (RNN), specifically Long Short-Term Memory (LSTM) variants.
- Deep Feed-Forward Networks (DNN): Implemented with up to five hidden layers, using ReLU activations together with dropout and max-in norm regularization.
- CNNs: Examined with varying configurations of convolutional layers, kernel widths, and feature maps. Applied dropout after pooling layers and utilized a softmax layer for classification.
- RNNs: Both forward and bi-directional LSTM networks were implemented. The authors also introduce a novel regularization method for training RNNs: a carry-over probability (p_carry) that mitigates overfitting by probabilistically resetting the network's internal state between training steps.
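The carry-over mechanism described above can be sketched in plain Python. This is a minimal illustration of the idea, not the authors' implementation; the class and parameter names (`CarryOverState`, `p_carry`) are chosen here for clarity:

```python
import random

class CarryOverState:
    """Sketch of carry-over regularization for RNN training: after each
    training step the internal state is either carried over into the next
    step (with probability p_carry) or reset to zeros, so the network does
    not overfit to arbitrarily long input histories."""

    def __init__(self, state_size, p_carry=0.5, seed=None):
        self.state_size = state_size
        self.p_carry = p_carry
        self.rng = random.Random(seed)
        self.state = [0.0] * state_size  # initial internal state

    def step(self, new_state):
        """Call after each training step with the RNN's updated state."""
        if self.rng.random() < self.p_carry:
            self.state = list(new_state)          # carry state forward
        else:
            self.state = [0.0] * self.state_size  # probabilistic reset
        return self.state
```

Setting p_carry close to 1 approaches ordinary stateful training, while p_carry = 0 resets the state at every step; tuning this value is what the paper identifies as most beneficial for forward LSTMs.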
Experimental Setup
The models were evaluated on three datasets: Opportunity (manipulative gestures), PAMAP2 (repetitive physical activities), and Daphnet Gait (a medical dataset focused on Parkinson's disease). For each dataset, thousands of model configurations were explored via random sampling of hyperparameters. The primary evaluation metrics were the mean F1-score (Fm) and the weighted F1-score (Fw), chosen to give an unbiased measurement across skewed class distributions.
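Both metrics derive from per-class precision and recall; Fm averages the per-class F1-scores equally, while Fw weights each class by its support. The following pure-Python sketch (a hypothetical `f1_scores` helper, not the authors' evaluation code) makes the difference concrete:

```python
from collections import Counter

def f1_scores(y_true, y_pred):
    """Return (Fm, Fw): the unweighted mean F1 over classes and the
    support-weighted F1, as used for evaluation in the paper."""
    classes = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)
    per_class = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        per_class[c] = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    fm = sum(per_class.values()) / len(classes)       # mean F1 (Fm)
    n = len(y_true)
    fw = sum(support[c] / n * per_class[c] for c in classes)  # weighted (Fw)
    return fm, fw
```

On imbalanced HAR data Fw tends to be higher than Fm, since frequent classes (e.g. the null class) are usually easier to recognize; reporting both exposes performance on rare activities.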
Results
The empirical results demonstrated distinct trends and performance characteristics:
- Recognition Performance: Recurrent models, particularly the bi-directional LSTM (b-LSTM-S), achieved the highest performance on the Opportunity dataset, outperforming the state of the art by a notable margin. Specifically:
  - On Opportunity, b-LSTM-S achieved an Fm of 0.745 and an Fw of 0.927.
  - On PAMAP2, the CNN model achieved the top performance with an Fm of 0.937.
  - On the Daphnet Gait dataset, LSTM-S models attained the highest performance with an Fm of 0.760.
- Hyperparameter Influence: The fANOVA analysis revealed that learning-related hyperparameters had the most significant impact on model performance for CNNs. For recurrent models, bi-directional LSTMs were highly sensitive to the number of units per layer, while forward LSTMs benefited most from tuning the carry-over probability.
- Model Robustness: CNNs exhibited robustness with a lower variance in performance across different hyperparameters. Conversely, DNNs required extensive hyperparameter tuning, showing a noticeable spread between peak and median performance.
Implications and Future Directions
This paper's findings have important implications for practitioners in the field of HAR:
- Recurrent models, particularly LSTM variants, are highly effective for tasks with temporal dependencies and can facilitate real-time HAR applications due to their ability to handle streaming data.
- Convolutional networks are recommended for activities that exhibit repetitive patterns over time, demonstrating reliable performance with fewer hyperparameter adjustments.
- Practitioners should prioritize hyperparameter tuning for learning rates in CNNs and the number of LSTM units in recurrent models.
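The random-search protocol the study relies on is straightforward to sketch. The parameter names and ranges below are illustrative assumptions for a HAR setting, not the exact search spaces used in the paper:

```python
import random

def sample_config(rng):
    """Draw one hyperparameter configuration at random, mirroring the
    paper's random-search protocol. Ranges here are illustrative."""
    return {
        "learning_rate": 10 ** rng.uniform(-4, -1),  # log-uniform rate
        "lstm_units": rng.choice([64, 128, 256]),    # units per layer
        "p_carry": rng.uniform(0.0, 1.0),            # state carry-over
    }

# Sample a pool of candidate configurations to train and compare.
rng = random.Random(0)
candidates = [sample_config(rng) for _ in range(1000)]
```

Sampling the learning rate log-uniformly (rather than uniformly) is the conventional choice, since its useful values span several orders of magnitude.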
The novel regularization method introduced for RNN training, which probabilistically resets the internal state, addresses overfitting and supports the practical application of these models in varied HAR contexts. Future research could focus on refining this technique and exploring its integration with other forms of dynamic regularization.
Overall, while recurrent models show remarkable promise for HAR applications, convolutional networks remain a reliable baseline for more predictable activity patterns. As deep learning techniques continue to evolve, their application in ubiquitous computing and HAR will likely expand, driven by contributions like those presented in this research.