A Comprehensive Python Library for Deep Learning-Based Event Detection in Multivariate Time Series Data and Information Retrieval in NLP (2310.16485v2)

Published 25 Oct 2023 in cs.LG

Abstract: Event detection in time series data is crucial in various domains, including finance, healthcare, cybersecurity, and science. Accurately identifying events in time series data is vital for making informed decisions, detecting anomalies, and predicting future trends. Although extensive research has explored diverse methods for event detection in time series, with deep learning approaches among the most advanced, there is still room for improvement and innovation in this field. In this paper, we present a new supervised deep learning method for detecting events in multivariate time series data. Our method introduces four novelties compared to existing supervised deep learning methods. Firstly, it is based on regression instead of binary classification. Secondly, it does not require datasets in which every point is labeled; instead, it only requires reference events defined as time points or time intervals. Thirdly, it is designed to be robust by using a stacked ensemble learning meta-model that combines deep learning models, ranging from classic feed-forward neural networks (FFNs) to state-of-the-art architectures like transformers. This ensemble approach can mitigate individual model weaknesses and biases, resulting in more robust predictions. Finally, to facilitate practical implementation, we have developed a Python package to accompany our proposed method. The package, called eventdetector-ts, can be installed through the Python Package Index (PyPI). In this paper, we present our method and provide a comprehensive guide on the usage of the package. We showcase its versatility and effectiveness through different real-world use cases, from NLP to financial security.
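
The first two points of the abstract (regression instead of per-point classification, and supervision from reference events given as time points or intervals) can be illustrated with a minimal sketch. It is not the eventdetector-ts API and not necessarily the authors' exact labeling scheme: the function name `window_targets`, its parameters, and the choice of "fraction of the window covered by an event" as the regression target are all assumptions made for illustration.

```python
import numpy as np


def window_targets(n_points, events, width, stride=1):
    """Return one regression target in [0, 1] per sliding window.

    Hypothetical helper, not part of eventdetector-ts.
    events: list of (start, end) index pairs, end exclusive; a point
    event at index t is written as (t, t + 1).
    """
    # Mark every time step covered by at least one reference event.
    mask = np.zeros(n_points, dtype=float)
    for start, end in events:
        mask[start:end] = 1.0
    # Assumed target: fraction of the window's steps that lie inside an event.
    starts = range(0, n_points - width + 1, stride)
    return np.array([mask[s:s + width].mean() for s in starts])


# Ten time steps, one interval event (steps 3 to 5) and one point event (step 8).
targets = window_targets(n_points=10, events=[(3, 6), (8, 9)], width=4, stride=2)
print(targets)  # [0.25 0.75 0.5  0.25]
```

A regressor trained on such targets predicts a continuous score per window, and events can then be recovered from peaks in the predicted score, so no per-point labels are needed.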

