
Experimental Comparison of Representation Methods and Distance Measures for Time Series Data (1012.2789v1)

Published 9 Dec 2010 in cs.AI

Abstract: The previous decade has brought a remarkable increase of the interest in applications that deal with querying and mining of time series data. Many of the research efforts in this context have focused on introducing new representation methods for dimensionality reduction or novel similarity measures for the underlying data. In the vast majority of cases, each individual work introducing a particular method has made specific claims and, aside from the occasional theoretical justifications, provided quantitative experimental observations. However, for the most part, the comparative aspects of these experiments were too narrowly focused on demonstrating the benefits of the proposed methods over some of the previously introduced ones. In order to provide a comprehensive validation, we conducted an extensive experimental study re-implementing eight different time series representations and nine similarity measures and their variants, and testing their effectiveness on thirty-eight time series data sets from a wide variety of application domains. In this paper, we give an overview of these different techniques and present our comparative experimental findings regarding their effectiveness. In addition to providing a unified validation of some of the existing achievements, our experiments also indicate that, in some cases, certain claims in the literature may be unduly optimistic.

Citations (834)

Summary

  • The paper rigorously compares eight representation techniques and nine similarity measures across 38 diverse time series datasets, challenging optimistic claims in previous studies.
  • Methodological evaluations reveal that spectral methods like DFT excel on periodic data while APCA performs better on bursty datasets, yet no single method dominates overall.
  • The study debunks misconceptions about DTW's efficiency and accuracy, emphasizing the need for reproducible testing and balanced parameter tuning in time series analysis.

Experimental Comparison of Representation Methods and Distance Measures for Time Series Data

The paper presents an extensive experimental study comparing various representation methods and distance measures for time series data. The authors re-implemented eight time series representation techniques and nine similarity measures, rigorously evaluating their performance on thirty-eight time series data sets drawn from multiple application domains. This comprehensive evaluation aimed to validate previous claims in the literature and to uncover potentially over-optimistic assertions about these methods.

Representation Methods for Time Series

The paper investigated the efficacy of major time series representation techniques including Discrete Fourier Transformation (DFT), Discrete Cosine Transformation (DCT), Discrete Wavelet Transformation (DWT), Piecewise Aggregate Approximation (PAA), Adaptive Piecewise Constant Approximation (APCA), Chebyshev polynomials (CHEB), Symbolic Aggregate approXimation (SAX), and Indexable Piecewise Linear Approximation (IPLA).
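As a concrete illustration of one of these representations, the sketch below shows how PAA reduces a series to segment means. This is a minimal example for intuition, not the authors' implementation; for simplicity it assumes the series length is divisible by the number of segments.

```python
import numpy as np

def paa(series, n_segments):
    """Piecewise Aggregate Approximation: reduce a length-n series to
    n_segments values, each the mean of one equal-width segment.
    Assumes len(series) is divisible by n_segments."""
    series = np.asarray(series, dtype=float)
    return series.reshape(n_segments, -1).mean(axis=1)

# Example: an 8-point series reduced to 4 segment means
x = [0.0, 2.0, 1.0, 3.0, 5.0, 5.0, 2.0, 0.0]
print(paa(x, 4))  # -> [1. 2. 5. 1.]
```

The other representations differ in what they keep per segment or coefficient (wavelet coefficients for DWT, line segments for IPLA, symbols for SAX), but all serve the same goal of dimensionality reduction with a lower-bounding distance.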

One significant finding from the experiments was that the differences in tightness of lower bounding, which correlates with pruning power and indexing effectiveness, were minimal across these representations. Notable variations emerged only under specific conditions: spectral methods such as DFT were superior on highly periodic data sets, whereas APCA performed better on bursty data sets. Overall, however, no single representation consistently outperformed the others across all scenarios.
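The tightness of lower bounding (TLB) is the ratio of a representation's lower-bound distance to the true Euclidean distance; values near 1 mean tighter bounds and better pruning. The following is an illustrative sketch for the standard PAA-based lower bound (not the paper's code; equal lengths divisible by the segment count are assumed):

```python
import numpy as np

def tlb_paa(q, c, n_segments):
    """Tightness of lower bounding for PAA: lower-bound distance over
    true Euclidean distance, always in [0, 1], since
    sqrt(n/N) * ||PAA(q) - PAA(c)|| lower-bounds the Euclidean distance.
    Assumes len(q) == len(c) and divisibility by n_segments."""
    q, c = np.asarray(q, dtype=float), np.asarray(c, dtype=float)
    n = len(q)
    pq = q.reshape(n_segments, -1).mean(axis=1)  # PAA of q
    pc = c.reshape(n_segments, -1).mean(axis=1)  # PAA of c
    lb = np.sqrt(n / n_segments) * np.linalg.norm(pq - pc)
    return lb / np.linalg.norm(q - c)

rng = np.random.default_rng(1)
q, c = rng.standard_normal(64), rng.standard_normal(64)
print(tlb_paa(q, c, 8))  # closer to 1.0 means a tighter, more useful bound
```

Averaging this ratio over many query/candidate pairs, as the paper does, yields a single comparable score per representation.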

Similarity Measures for Time Series

The similarity measures evaluated included Euclidean distance (ED), L1-norm, L∞-norm, Dynamic Time Warping (DTW), Edit Distance with Real Penalty (ERP), Edit Distance on Real sequence (EDR), Longest Common Subsequence (LCSS), DISSIM, Sequence Weighted Alignment model (Swale), Threshold Queries (TQuEST), and Spatial Assembling Distance (SpADe).

The results indicated nuanced differences in accuracy, heavily influenced by dataset attributes and the size of the training set. The primary conclusions were:

  1. Elastic Measures vs. Lock-Step Measures: DTW and its constrained variant were generally more accurate than Euclidean distance, especially on smaller datasets. However, as the size of the training set increased, the accuracy of Euclidean distance converged with that of elastic measures.
  2. Edit Distance-Based Measures: Measures such as LCSS, EDR, and ERP showed accuracy comparable to DTW's but did not significantly outperform it.
  3. Novel Similarity Measures: TQuEST and SpADe appeared inferior to elastic measures overall, despite some dataset-specific improvements.
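To make the elastic vs. lock-step distinction concrete, here is a minimal DTW implementation with an optional Sakoe-Chiba band (an illustrative sketch under our own simplifications, not the paper's code). A small temporal shift that Euclidean distance penalizes heavily is absorbed by DTW's warping:

```python
import numpy as np

def dtw(a, b, window=None):
    """Dynamic Time Warping distance with an optional Sakoe-Chiba band.
    window=None gives unconstrained DTW; window=0 degenerates to a
    lock-step alignment on equal-length series."""
    n, m = len(a), len(b)
    w = max(window if window is not None else max(n, m), abs(n - m))
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return np.sqrt(D[n, m])

a = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
b = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0]   # same shape, shifted by one step
print(dtw(a, b))                      # -> 0.0 (warping absorbs the shift)
print(np.linalg.norm(np.array(a) - np.array(b)))  # -> 2.0 (lock-step penalty)
```

The band radius is the parameter whose tuning the paper cautions must be done on a training set, not the test set.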

Insights on DTW and Other Elastic Measures

The paper also addressed several misconceptions about DTW:

  1. Computational Efficiency: The belief that DTW is too slow for practical applications was dispelled by demonstrating that, with appropriate optimizations like early abandoning and lower bounding (e.g., LB Keogh), DTW could perform sufficiently quickly, even on large datasets.
  2. Future Potential for Speed-Up: The authors showed that further significant speed improvements in DTW through tighter lower bounds are unlikely. They argued that the improvements seen in some newer methods might stem from biases or insufficiently rigorous testing procedures.
  3. Accuracy Claims: Through a methodical examination, the paper debunked claims that newer measures surpass DTW in accuracy, suggesting that observed improvements might result from parameter overfitting or non-blind testing procedures.
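The LB_Keogh optimization mentioned above can be sketched as follows (an illustrative rendering of the standard formulation, not the authors' code): the query is compared against an upper/lower envelope of the candidate, yielding a cheap lower bound on the band-constrained DTW distance, so a search can skip most full DTW computations.

```python
import numpy as np

def lb_keogh(query, candidate, r):
    """LB_Keogh: cheap O(n) lower bound on DTW constrained to a
    Sakoe-Chiba band of radius r. Only query points falling outside
    the candidate's running min/max envelope contribute, so the
    result never exceeds the true constrained DTW distance."""
    q = np.asarray(query, dtype=float)
    c = np.asarray(candidate, dtype=float)
    n = len(c)
    total = 0.0
    for i, qi in enumerate(q):
        lo, hi = max(0, i - r), min(n, i + r + 1)
        upper, lower = c[lo:hi].max(), c[lo:hi].min()
        if qi > upper:
            total += (qi - upper) ** 2
        elif qi < lower:
            total += (lower - qi) ** 2
    return np.sqrt(total)

q = [0.0, 1.0, 3.0, 2.0, 0.0]
c = [0.0, 0.0, 1.0, 2.0, 1.0]
print(lb_keogh(q, c, 1))  # cheap bound; full DTW runs only if this beats the best-so-far
```

In a 1-NN search the bound is computed first, and the expensive DTW is evaluated only when the bound falls below the best distance found so far, which is the early-abandoning strategy the paper credits for DTW's practical speed.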

Implications and Future Directions

This comprehensive evaluation provides a critical benchmark for researchers, emphasizing the necessity of rigorous, reproducible testing and cautioning against overfitting and parameter tuning on test sets. The results also encourage further investigation into the dataset characteristics that influence the effectiveness of different similarity measures, potentially leading to more adaptive and robust approaches for time series analysis.

Conclusion

The paper presents a thorough audit of time series representation methods and similarity measures, challenging some of the optimistic claims found in prior literature and providing a solid foundation for future research. The results advocate for a balanced and methodical approach to evaluating new techniques, ensuring robust, application-agnostic performance enhancements rather than dataset-specific optimizations.