
On the number of iterations of the DBA algorithm (2401.05841v1)

Published 11 Jan 2024 in cs.CG

Abstract: The DTW Barycenter Averaging (DBA) algorithm is a widely used algorithm for estimating the mean of a given set of point sequences. In this context, the mean is defined as a point sequence that minimises the sum of dynamic time warping (DTW) distances to the input sequences. The algorithm is similar to the $k$-means algorithm in the sense that it alternately repeats two steps: (1) computing an optimal assignment to the points of the current mean, and (2) computing an optimal mean under the current assignment. The popularity of DBA can be attributed to the fact that it works well in practice, despite the lack of known theoretical guarantees. In our paper, we aim to initiate a theoretical study of the number of iterations that DBA performs until convergence. We assume the algorithm is given $n$ sequences of $m$ points in $\mathbb{R}^d$ and a parameter $k$ that specifies the length of the mean sequence to be computed. We show that, in contrast to its fast running time in practice, the number of iterations can be exponential in $k$ in the worst case, even if the number of input sequences is $n=2$. We complement these findings with experiments on real-world data that suggest this worst-case behaviour is likely degenerate. To better understand the performance of the algorithm on non-degenerate input, we study DBA in the model of smoothed analysis, upper-bounding the expected number of iterations in the worst case under random perturbations of the input. Our smoothed upper bound is polynomial in $k$, $n$ and $d$, and for constant $n$, it is also polynomial in $m$. For our analysis, we adapt the set of techniques that were developed for analysing $k$-means and observe that this set of techniques is not sufficient to obtain tight bounds for general $n$.
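The two alternating steps described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes squared Euclidean distance as the local DTW cost, stops when the assignment (the set of warping paths) no longer changes, and uses hypothetical names (`sq_dist`, `dtw_path`, `dba`) chosen for this example. The returned counter is the number of mean updates performed, which is the quantity the paper's iteration bounds concern.

```python
from typing import List, Tuple

Point = Tuple[float, ...]
Sequence = List[Point]


def sq_dist(p: Point, q: Point) -> float:
    """Squared Euclidean distance (assumed local cost for DTW)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def dtw_path(mean: Sequence, seq: Sequence) -> Tuple[float, List[Tuple[int, int]]]:
    """Return the DTW cost and one optimal warping path between mean and seq."""
    k, m = len(mean), len(seq)
    inf = float("inf")
    cost = [[inf] * m for _ in range(k)]
    for i in range(k):
        for j in range(m):
            d = sq_dist(mean[i], seq[j])
            if i == 0 and j == 0:
                cost[i][j] = d
                continue
            best = min(
                cost[i - 1][j] if i > 0 else inf,
                cost[i][j - 1] if j > 0 else inf,
                cost[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            cost[i][j] = d + best
    # Backtrack from the last cell; ties are broken deterministically.
    i, j = k - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((cost[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((cost[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((cost[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return cost[k - 1][m - 1], path


def dba(seqs: List[Sequence], init_mean: Sequence, max_iter: int = 1000):
    """Alternate assignment and mean update until the assignment is stable."""
    mean = list(init_mean)
    dim = len(mean[0])
    prev_assignment = None
    for updates in range(max_iter):
        # Step 1: optimal assignment of input points to points of the mean.
        assignment = [dtw_path(mean, s)[1] for s in seqs]
        if assignment == prev_assignment:
            return mean, updates
        # Step 2: optimal mean under this assignment (centroid per mean point).
        sums = [[0.0] * dim for _ in mean]
        counts = [0] * len(mean)
        for s, path in zip(seqs, assignment):
            for i, j in path:
                counts[i] += 1
                for c in range(dim):
                    sums[i][c] += s[j][c]
        mean = [tuple(x / counts[i] for x in sums[i]) for i in range(len(mean))]
        prev_assignment = assignment
    return mean, max_iter


if __name__ == "__main__":
    seqs = [[(0.0,), (1.0,), (2.0,)], [(0.0,), (1.0,), (2.0,)]]
    mean, updates = dba(seqs, init_mean=[(0.0,), (2.0,)])
    print(mean, updates)
```

Every warping path visits each index of the mean at least once, so the centroid in step 2 is always well defined. Here `len(init_mean)` plays the role of the parameter $k$, `len(seqs)` is $n$, and each input sequence has $m$ points.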
