Periodicity Detection of Outlier Sequences Using Constraint Based Pattern Tree with MAD (1507.01685v1)
Abstract: Patterns that appear rarely or unusually in data can be defined as outlier patterns. The basic idea behind detecting them is to compare their relative frequencies with those of frequent patterns: outlier patterns occur less often and therefore have lower support in the data. Detecting outlier patterns is an important data mining task because it can reveal interesting facts. Searching for the periodicity of these patterns describes their behavior over time, indicating when they are likely to repeat, which in turn helps in predicting events. Such patterns are found in time-series data, social networks, and other domains. In this paper, an algorithm for periodic outlier pattern detection is proposed that uses a Constraint Based FP (Frequent Pattern)-tree as the underlying data structure for time-series data. The growth of the tree is limited by level and monotonic constraints. The protein sequence of the bacterium E. coli is collected, and periodic outlier patterns in the sequence are identified. The results are further improved by using the Median Absolute Deviation (MAD) to define candidate outlier patterns. Comparative results between STNR-out (Suffix Tree Noise Resilient for Outlier Detection) and the proposed algorithm are presented and show the effectiveness and applicability of the proposed algorithm.
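
To make the role of MAD concrete, the following is a minimal sketch of how candidate outlier patterns might be selected from pattern frequencies. It is not the algorithm of the paper: the abstract does not state the exact rule, so the cutoff `median - threshold * MAD`, the fixed pattern length, and the names `pattern_counts`, `mad`, and `candidate_outliers` are all assumptions made for illustration, and the Constraint Based FP-tree is replaced here by a plain substring count.

```python
from collections import Counter
from statistics import median


def pattern_counts(sequence, length):
    """Count every contiguous substring of the given length."""
    last_start = len(sequence) - length + 1
    return Counter(sequence[i:i + length] for i in range(last_start))


def mad(values):
    """Median Absolute Deviation: the median of |x - median(values)|."""
    values = list(values)
    center = median(values)
    return median(abs(v - center) for v in values)


def candidate_outliers(sequence, length, threshold=3.0):
    """Return patterns whose count falls below median - threshold * MAD.

    The cutoff rule is an assumption for illustration, not the rule
    used in the paper.
    """
    counts = pattern_counts(sequence, length)
    freqs = list(counts.values())
    cutoff = median(freqs) - threshold * mad(freqs)
    return sorted(p for p, c in counts.items() if c < cutoff)
```

Because the median and MAD are robust to a few extreme values, a handful of very frequent patterns does not shift the cutoff much, which is the usual motivation for preferring MAD over the mean and standard deviation. Note that when more than half of the counts are identical the MAD is zero, and under this sketch every pattern below the median would be flagged.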