Fast and Accurate Time Series Classification with WEASEL (1701.07681v1)

Published 26 Jan 2017 in cs.DS, cs.LG, and stat.ML

Abstract: Time series (TS) occur in many scientific and commercial applications, ranging from earth surveillance to industry automation to smart grids. An important type of TS analysis is classification, which can, for instance, improve energy load forecasting in smart grids by detecting the types of electronic devices based on their energy consumption profiles recorded by automatic sensors. Such sensor-driven applications are very often characterized by (a) very long TS and (b) very large TS datasets needing classification. However, current methods for time series classification (TSC) cannot cope with such data volumes at acceptable accuracy; they are either scalable but offer only inferior classification quality, or they achieve state-of-the-art classification quality but cannot scale to large data volumes. In this paper, we present WEASEL (Word ExtrAction for time SEries cLassification), a novel TSC method which is both scalable and accurate. Like other state-of-the-art TSC methods, WEASEL transforms time series into feature vectors, using a sliding-window approach, which are then analyzed through a machine learning classifier. The novelty of WEASEL lies in its specific method for deriving features, resulting in a much smaller yet much more discriminative feature set. On the popular UCR benchmark of 85 TS datasets, WEASEL is more accurate than the best current non-ensemble algorithms at orders-of-magnitude lower classification and training times, and it is almost as accurate as ensemble classifiers, whose computational complexity makes them inapplicable even for mid-size datasets. The outstanding robustness of WEASEL is also confirmed by experiments on two real smart grid datasets, where it out-of-the-box achieves almost the same accuracy as highly tuned, domain-specific methods.

Summary

The paper presents WEASEL, a method that balances scalability and accuracy in time series classification.
It employs a sliding window Bag-of-Patterns approach to extract compact and highly discriminative feature sets.
Experiments on 85 UCR datasets and smart grid data demonstrate high accuracy with reduced computational demands.

Fast and Accurate Time Series Classification with WEASEL

The paper presents WEASEL (Word ExtrAction for time SEries cLassification), a method for time series classification (TSC) that addresses the dual challenges of scalability and accuracy on large datasets of very long time series. Existing methods are either scalable but comparatively inaccurate, or accurate but too expensive to run on large data volumes; WEASEL is designed to deliver both. Time series arising in settings such as smart grids and industry automation call for methods that classify large volumes of data quickly and accurately, which is the need WEASEL targets.

In essence, WEASEL follows the Bag-of-Patterns approach: a sliding window moves over each time series, and the extracted windows are transformed into a feature vector that is then passed to a machine learning classifier. Its novelty lies in the specific method used for feature extraction, which yields a much smaller yet much more discriminative feature set.
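The sliding-window Bag-of-Patterns idea can be illustrated with a minimal sketch. It is not WEASEL's actual feature extraction: the discretization below (z-normalization, segment means, and fixed SAX-style breakpoints) is a simplified stand-in chosen for brevity, and the names `bag_of_patterns`, `window_to_word`, `BREAKPOINTS`, and `ALPHABET` are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev

# Fixed breakpoints that split a standard normal into four bins of roughly
# equal probability. This is an assumption of the sketch, not WEASEL's
# learned (supervised) quantization.
BREAKPOINTS = (-0.67, 0.0, 0.67)
ALPHABET = "abcd"


def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    mu = mean(window)
    sigma = pstdev(window)
    if sigma == 0:
        return [0.0] * len(window)
    return [(x - mu) / sigma for x in window]


def window_to_word(window, word_length):
    """Turn one window into a word with one symbol per segment mean."""
    normed = znorm(window)
    seg = len(normed) // word_length  # trailing remainder is ignored
    symbols = []
    for i in range(word_length):
        segment_mean = mean(normed[i * seg:(i + 1) * seg])
        # Index of the bin the segment mean falls into.
        symbol_index = sum(segment_mean > b for b in BREAKPOINTS)
        symbols.append(ALPHABET[symbol_index])
    return "".join(symbols)


def bag_of_patterns(series, window_size, word_length):
    """Slide a window over the series and count the resulting words."""
    if not 0 < word_length <= window_size <= len(series):
        raise ValueError("need 0 < word_length <= window_size <= len(series)")
    words = (
        window_to_word(series[i:i + window_size], word_length)
        for i in range(len(series) - window_size + 1)
    )
    return Counter(words)


bag = bag_of_patterns([0, 0, 0, 0, 1, 1, 1, 1], window_size=4, word_length=2)
```

Each series becomes a histogram of words; stacking these histograms over a shared vocabulary gives the feature vectors that a classifier would consume.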

On the widely used UCR benchmark of 85 datasets, WEASEL is more accurate than the best current non-ensemble classifiers while requiring orders-of-magnitude less time for training and classification. It comes close to the accuracy of ensemble methods such as COTE, whose computational complexity makes them impractical even for mid-size datasets. In experiments on two real smart grid datasets, WEASEL achieved almost the same accuracy as highly tuned, domain-specific methods without any tuning of its own, underscoring its robustness and applicability.

Implications and Future Directions

The WEASEL algorithm introduces several innovations that enhance its applicability in real-world scenarios where both speed and accuracy are paramount. By utilizing a high-dimensional feature space with efficient feature selection, incorporating word co-occurrences, and implementing a supervised symbolic representation, WEASEL is well-equipped to highlight variable-length substructures within time series data that are indicative of different classes. This makes it particularly well-suited for domains characterized by noisy or repetitive structures, including sensor data and image outlines.
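Two of these ingredients, word co-occurrences and feature selection, can be sketched as follows. This is an illustration under stated assumptions rather than the paper's implementation: co-occurrences are modeled as bigrams of consecutive words, features are ranked with a chi-squared statistic computed on summed counts per class, and the helper names, the `_` separator, and the threshold are hypothetical.

```python
from collections import Counter, defaultdict


def add_bigrams(words, gap=1):
    """Count single words plus pairs of words that are `gap` positions apart."""
    bag = Counter(words)
    for first, second in zip(words, words[gap:]):
        bag[f"{first}_{second}"] += 1
    return bag


def chi2_scores(bags, labels):
    """Chi-squared score per feature: observed vs. expected count per class."""
    n = len(bags)
    class_share = {c: count / n for c, count in Counter(labels).items()}
    observed = defaultdict(Counter)  # feature -> class -> summed count
    for bag, label in zip(bags, labels):
        for feature, count in bag.items():
            observed[feature][label] += count
    scores = {}
    for feature, per_class in observed.items():
        total = sum(per_class.values())
        # Under independence, a class receives its share of the total count.
        scores[feature] = sum(
            (per_class[c] - total * share) ** 2 / (total * share)
            for c, share in class_share.items()
        )
    return scores


def select_features(bags, labels, threshold):
    """Keep only features whose score exceeds the (hypothetical) threshold."""
    return {f for f, s in chi2_scores(bags, labels).items() if s > threshold}


bags = [add_bigrams(["aa", "aa", "bb"]), add_bigrams(["bb", "cc", "cc"])]
labels = [0, 1]
kept = select_features(bags, labels, threshold=0.5)
```

Adding bigrams enlarges the feature space considerably, which is why a selection step matters: in the toy data above, the word `bb` occurs equally often in both classes, scores zero, and is dropped, while class-specific words and bigrams are kept.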

Practically, using TSC to identify device types from their energy consumption profiles can improve energy load forecasting, which makes WEASEL valuable for industries aiming to optimize energy use and may translate into cost savings. Theoretically, it reinforces the paradigm of using supervised feature extraction to enhance structural pattern recognition, contributing to the broader field of machine learning and pattern recognition.

Looking ahead, the research suggests two potential advancements. First, extending WEASEL to handle multivariate time series could expand its applicability and utility across numerous domains that rely on data from multiple sensors. Second, incorporating strategies to manage time series with variable sampling rates could further increase its versatility and accuracy.

Overall, WEASEL stands out as a compelling solution for fast and accurate time series classification, with promising implications for both practical applications and theoretical advancements in TSC methodologies.
