Papers
Topics
Authors
Recent
Search
2000 character limit reached

Data Curves Clustering Using Common Patterns Detection

Published 5 Jan 2020 in cs.AI | (2001.02095v1)

Abstract: For the past decades we have experienced an enormous expansion of the accumulated data that humanity produces. Daily a numerous number of smart devices, usually interconnected over internet, produce vast, real-values datasets. Time series representing datasets from completely irrelevant domains such as finance, weather, medical applications, traffic control etc. become more and more crucial in human day life. Analyzing and clustering these time series, or in general any kind of curves, could be critical for several human activities. In the current paper, the new Curves Clustering Using Common Patterns (3CP) methodology is introduced, which applies a repeated pattern detection algorithm in order to cluster sequences according to their shape and the similarities of common patterns between time series, data curves and eventually any kind of discrete sequences. For this purpose, the Longest Expected Repeated Pattern Reduced Suffix Array (LERP-RSA) data structure has been used in combination with the All Repeated Patterns Detection (ARPaD) algorithm in order to perform highly accurate and efficient detection of similarities among data curves that can be used for clustering purposes and which also provides additional flexibility and features.

Citations (1)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.