Detecting Change Intervals with Isolation Distributional Kernel (2212.14630v3)

Published 30 Dec 2022 in cs.LG

Abstract: Detecting abrupt changes in data distribution is one of the most significant tasks in streaming data analysis. Although many unsupervised Change-Point Detection (CPD) methods have been proposed recently to identify those changes, they still suffer from missing subtle changes, poor scalability, and/or sensitivity to outliers. To meet these challenges, we are the first to generalise the CPD problem as a special case of the Change-Interval Detection (CID) problem. We then propose a CID method, named iCID, based on the recent Isolation Distributional Kernel (IDK). iCID identifies a change interval when there is a high dissimilarity score between two non-homogeneous, temporally adjacent intervals. The data-dependent property and finite feature map of IDK enable iCID to efficiently identify various types of change-points in data streams while tolerating outliers. Moreover, the proposed online and offline versions of iCID are able to optimise key parameter settings. The effectiveness and efficiency of iCID have been systematically verified on both synthetic and real-world datasets.
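The core idea in the abstract, scoring the dissimilarity between temporally adjacent intervals with a kernel that has a finite feature map, can be illustrated with a minimal sketch. This is not the authors' iCID implementation. It assumes a nearest-neighbour (Voronoi) style isolation kernel, fixed non-overlapping windows, and a cosine-normalised similarity; the function names and the parameters `psi`, `t`, and `window` are hypothetical choices made for illustration, and the sketch omits the parameter optimisation and thresholding that the paper describes.

```python
import numpy as np

def fit_partitions(data, psi=8, t=100, seed=0):
    """Draw t random subsets of psi points; each subset induces one
    Voronoi partition of the space (assumed isolation mechanism)."""
    rng = np.random.default_rng(seed)
    n = len(data)
    idx = np.stack([rng.choice(n, size=psi, replace=False) for _ in range(t)])
    return data[idx]  # shape (t, psi, d)

def cell_indices(points, centers):
    """For every point and every partition, return the index of the
    nearest centre, i.e. the Voronoi cell the point falls into."""
    diff = points[:, None, None, :] - centers[None, :, :, :]  # (n, t, psi, d)
    dist = np.einsum("ntpd,ntpd->ntp", diff, diff)
    return dist.argmin(axis=2)  # shape (n, t)

def mean_embedding(cells, psi):
    """Average the one-hot cell indicators of an interval: a finite
    feature-map representation of the interval's distribution."""
    m, t = cells.shape
    emb = np.zeros((t, psi))
    for j in range(t):
        emb[j] = np.bincount(cells[:, j], minlength=psi)
    return (emb / m).ravel()

def interval_dissimilarity(stream, window=50, psi=8, t=100, seed=0):
    """Score each pair of adjacent, non-overlapping intervals with
    1 - cosine similarity of their mean embeddings."""
    stream = np.asarray(stream, dtype=float)
    if stream.ndim == 1:
        stream = stream[:, None]
    centers = fit_partitions(stream, psi, t, seed)
    cells = cell_indices(stream, centers)
    n_int = len(stream) // window
    embs = [mean_embedding(cells[i * window:(i + 1) * window], psi)
            for i in range(n_int)]
    scores = []
    for a, b in zip(embs[:-1], embs[1:]):
        cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
        scores.append(1.0 - cos)
    return np.array(scores)
```

Under these assumptions, `scores[i]` compares interval `i` with interval `i + 1`, so a high value suggests that the distribution changed somewhere around the boundary between them. Because each interval is reduced to a fixed-length vector, comparing two intervals costs time proportional to `t * psi` regardless of the interval length.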
