
Transfer learning for time series classification (1811.01533v1)

Published 5 Nov 2018 in cs.LG, cs.AI, and stat.ML

Abstract: Transfer learning for deep neural networks is the process of first training a base network on a source dataset, and then transferring the learned features (the network's weights) to a second network to be trained on a target dataset. This idea has been shown to improve deep neural network's generalization capabilities in many computer vision tasks such as image recognition and object localization. Apart from these applications, deep Convolutional Neural Networks (CNNs) have also recently gained popularity in the Time Series Classification (TSC) community. However, unlike for image recognition problems, transfer learning techniques have not yet been investigated thoroughly for the TSC task. This is surprising as the accuracy of deep learning models for TSC could potentially be improved if the model is fine-tuned from a pre-trained neural network instead of training it from scratch. In this paper, we fill this gap by investigating how to transfer deep CNNs for the TSC task. To evaluate the potential of transfer learning, we performed extensive experiments using the UCR archive which is the largest publicly available TSC benchmark containing 85 datasets. For each dataset in the archive, we pre-trained a model and then fine-tuned it on the other datasets resulting in 7140 different deep neural networks. These experiments revealed that transfer learning can improve or degrade the model's predictions depending on the dataset used for transfer. Therefore, in an effort to predict the best source dataset for a given target dataset, we propose a new method relying on Dynamic Time Warping to measure inter-datasets similarities. We describe how our method can guide the transfer to choose the best source dataset leading to an improvement in accuracy on 71 out of 85 datasets.

Authors (5)
  1. Hassan Ismail Fawaz (12 papers)
  2. Germain Forestier (23 papers)
  3. Jonathan Weber (15 papers)
  4. Lhassane Idoumghar (14 papers)
  5. Pierre-Alain Muller (9 papers)
Citations (237)

Summary

Transfer Learning for Time Series Classification

In this paper, the authors present an extensive investigation into the application of transfer learning within the domain of Time Series Classification (TSC) using deep Convolutional Neural Networks (CNNs). The research aims to address a notable gap: despite the demonstrated efficacy of transfer learning in other areas such as computer vision, it remains underexplored for TSC tasks.

Summary of Key Findings

The paper commences by underscoring the potential of transfer learning to improve the generalization of deep learning models, which is particularly pertinent when labeled training data is scarce, a common constraint in TSC. The authors conduct an exhaustive experimental evaluation, transferring deep CNNs across the 85 datasets of the UCR archive and training 7140 models in total, one for each of the 85 × 84 = 7140 ordered source-target dataset pairs.
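To make the pre-train/fine-tune workflow concrete, the sketch below trains a network on a source dataset, copies every layer's weights except the class-specific softmax head, and then fine-tunes all layers on the target. The layer sizes echo the fully convolutional network (FCN) family used in the paper, but the synthetic data, epoch counts, and hyperparameters are illustrative assumptions rather than the authors' exact configuration.

```python
# Sketch of the pre-train / fine-tune workflow (illustrative assumptions:
# synthetic data, small epoch counts, generic hyperparameters; layer sizes
# echo the paper's FCN family but are not the authors' exact setup).
import numpy as np
from tensorflow import keras

def build_fcn(series_length, n_classes):
    """Fully convolutional network for univariate time series."""
    inputs = keras.Input(shape=(series_length, 1))
    x = inputs
    for filters, kernel in ((128, 8), (256, 5), (128, 3)):
        x = keras.layers.Conv1D(filters, kernel, padding="same")(x)
        x = keras.layers.BatchNormalization()(x)
        x = keras.layers.Activation("relu")(x)
    x = keras.layers.GlobalAveragePooling1D()(x)
    outputs = keras.layers.Dense(n_classes, activation="softmax")(x)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Synthetic stand-ins for a large source dataset and a small target dataset.
rng = np.random.default_rng(0)
x_src, y_src = rng.normal(size=(500, 128, 1)), rng.integers(0, 3, 500)
x_tgt, y_tgt = rng.normal(size=(60, 128, 1)), rng.integers(0, 2, 60)

# 1) Pre-train on the source dataset.
source_model = build_fcn(series_length=128, n_classes=3)
source_model.fit(x_src, y_src, epochs=5, verbose=0)

# 2) Transfer: copy every layer's weights except the class-specific softmax
#    head, then fine-tune all layers on the target dataset.
target_model = build_fcn(series_length=128, n_classes=2)
for src_layer, tgt_layer in zip(source_model.layers[:-1],
                                target_model.layers[:-1]):
    tgt_layer.set_weights(src_layer.get_weights())
target_model.fit(x_tgt, y_tgt, epochs=5, verbose=0)
```

Copying everything but the final dense layer and then retraining the whole network corresponds to fine-tuning rather than feature extraction with frozen layers, which matches the transfer strategy the summary describes.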

The results indicate that transfer learning can indeed enhance model accuracy in TSC, contingent on the choice of source and target datasets: a well-matched source dataset improves performance, while a poorly matched one can degrade it, a phenomenon known as negative transfer.

To mitigate the risk of negative transfer, the paper proposes a method based on Dynamic Time Warping (DTW) to measure inter-dataset similarity and thereby guide the choice of source dataset. Guided by this DTW-based selection, transfer learning improves accuracy on 71 of the 85 datasets, clearly outperforming a random choice of source dataset.
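A minimal sketch of this selection procedure follows, with two hedged simplifications relative to the paper: the per-class prototypes are plain means rather than the DTW Barycenter Averaging (DBA) barycenters the paper computes, and the prototype distances are aggregated with a simple minimum.

```python
# Sketch of DTW-guided source selection. The paper reduces each dataset to
# one prototype per class via DBA and compares datasets through DTW
# distances between prototypes; here a per-class mean stands in for the
# barycenter and distances are aggregated with a minimum (both
# simplifying assumptions).
import numpy as np

def dtw(a, b):
    """Classic O(len(a) * len(b)) dynamic-time-warping distance."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i, j] = d + min(cost[i - 1, j],
                                 cost[i, j - 1],
                                 cost[i - 1, j - 1])
    return float(np.sqrt(cost[n, m]))

def class_prototypes(x, y):
    """One prototype series per class (a mean here; the paper uses DBA)."""
    return [x[y == c].mean(axis=0) for c in np.unique(y)]

def dataset_distance(protos_a, protos_b):
    """Distance between datasets via their closest pair of prototypes."""
    return min(dtw(p, q) for p in protos_a for q in protos_b)

def pick_source(target_x, target_y, candidates):
    """Choose the candidate source dataset most similar to the target."""
    t_protos = class_prototypes(target_x, target_y)
    return min(candidates,
               key=lambda name: dataset_distance(
                   t_protos, class_prototypes(*candidates[name])))

# Random stand-ins for UCR-style datasets (names are for illustration only).
rng = np.random.default_rng(1)
target_x, target_y = rng.normal(size=(40, 64)), rng.integers(0, 2, 40)
candidates = {name: (rng.normal(size=(100, 64)), rng.integers(0, 3, 100))
              for name in ("GunPoint", "ECG200", "Coffee")}
print(pick_source(target_x, target_y, candidates))
```

With real data, the candidate dictionary would hold the other UCR training sets, and the returned name identifies the dataset to pre-train on before fine-tuning as sketched above.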

Implications and Future Directions

The paper highlights several implications for both practitioners and researchers within the field of TSC:

  1. Practical Implications: Practitioners should consider leveraging pre-trained models on related time series datasets instead of training models from scratch, particularly for datasets with limited training data. The use of DTW as a similarity measure provides a practical mechanism for selecting appropriate source datasets, thus facilitating more reliable transfer learning applications.
  2. Research Implications: The paper opens avenues for further exploration into the domain of transfer learning for TSC. The relationship between time series similarity measures, such as DTW, and the features learned by CNNs is a promising area for future investigations, potentially leading to more sophisticated models and methodologies.
  3. Theoretical Implications: From a theoretical standpoint, this work underscores the challenges and opportunities in transferring knowledge between time series with different characteristics, prompting deeper inquiry into why learned feature representations transfer well between some dataset pairs and poorly between others.
  4. Technical Developments: On a technical level, the deployment of pre-trained models may significantly reduce computational costs and training times, making TSC more accessible for large-scale applications. Future work could focus on optimizing the transfer learning process itself, such as by refining DTW parameters or integrating other machine learning techniques.

In conclusion, while the paper marks a significant step forward in applying transfer learning to TSC, it also encourages a more careful methodological approach: model accuracy and reliability hinge on informed source-dataset selection. These findings carry potential implications across domains that rely on time series data, including finance, healthcare, and environmental studies. As deep learning continues to evolve, the intersection of transfer learning and TSC is poised for noteworthy advancements.