
Fast dynamic time warping and clustering in C++ (2307.04904v1)

Published 10 Jul 2023 in eess.SP, cs.LG, cs.SY, and eess.SY

Abstract: We present an approach for computationally efficient dynamic time warping (DTW) and clustering of time-series data. The method frames the dynamic warping of time series datasets as an optimisation problem solved using dynamic programming, and then clusters time series data by solving a second optimisation problem using mixed-integer programming (MIP). There is also an option to use k-medoids clustering for increased speed, when a certificate for global optimality is not essential. The improved efficiency of our approach is due to task-level parallelisation of the clustering alongside DTW. Our approach was tested using the UCR Time Series Archive, and was found to be, on average, 33% faster than the next fastest option when using the same clustering method. This increases to 64% faster when considering only larger datasets (with more than 1000 time series). The MIP clustering is most effective on small numbers of longer time series, because the DTW computation is faster than other approaches, but the clustering problem becomes increasingly computationally expensive as the number of time series to be clustered increases.
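
The dynamic-programming formulation of DTW mentioned in the abstract can be illustrated with a minimal C++ sketch. This is not the authors' implementation: the function names dtw_cost and dtw_matrix are hypothetical, the squared-difference local cost is an assumption, and the pairwise loop is serial, whereas the paper attributes its speed to task-level parallelisation. The sketch keeps only two rows of the cost table, so memory grows with the length of one series rather than with the product of both lengths.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical helper: DTW cost between two series via dynamic programming.
// Assumption: local cost is the squared difference of the aligned samples.
double dtw_cost(const std::vector<double>& x, const std::vector<double>& y) {
    const std::size_t n = x.size();
    const std::size_t m = y.size();
    const double inf = std::numeric_limits<double>::infinity();

    // prev holds row i-1 of the cumulative cost table, curr holds row i.
    std::vector<double> prev(m + 1, inf);
    std::vector<double> curr(m + 1, inf);
    prev[0] = 0.0;  // aligning two empty prefixes costs nothing

    for (std::size_t i = 1; i <= n; ++i) {
        curr[0] = inf;  // a non-empty prefix cannot align with an empty one
        for (std::size_t j = 1; j <= m; ++j) {
            const double d = x[i - 1] - y[j - 1];
            // Cheapest predecessor: step in x only, in y only, or in both.
            const double best = std::min({prev[j], curr[j - 1], prev[j - 1]});
            curr[j] = d * d + best;
        }
        std::swap(prev, curr);
    }
    return prev[m];
}

// Hypothetical helper: symmetric matrix of pairwise DTW costs, the input a
// clustering step (for example k-medoids) would consume. Serial for clarity.
std::vector<std::vector<double>> dtw_matrix(
    const std::vector<std::vector<double>>& series) {
    const std::size_t k = series.size();
    std::vector<std::vector<double>> dist(k, std::vector<double>(k, 0.0));
    for (std::size_t a = 0; a < k; ++a) {
        for (std::size_t b = a + 1; b < k; ++b) {
            dist[a][b] = dist[b][a] = dtw_cost(series[a], series[b]);
        }
    }
    return dist;
}
```

Under these assumptions, dtw_cost({0.0, 0.0, 1.0}, {0.0, 1.0}) evaluates to 0 because warping lets both leading zeros align with the single zero in the second series. Each pair in dtw_matrix is independent of the others, which is why the pairwise stage lends itself to the kind of parallelisation the abstract describes.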
