
Novelty Detection in Sequential Data by Informed Clustering and Modeling (2103.03943v2)

Published 5 Mar 2021 in cs.LG

Abstract: Novelty detection in discrete sequences is a challenging task, since deviations from the process generating the normal data are often small or intentionally hidden. Novelties can be detected by modeling normal sequences and measuring the deviations of a new sequence from the model predictions. However, in many applications data is generated by several distinct processes, so that models trained on all the data tend to over-generalize and novelties remain undetected. We propose to approach this challenge through decomposition: by clustering the data we break down the problem, obtaining a simpler modeling task in each cluster, which can be modeled more accurately. However, this comes with a trade-off, since the amount of training data per cluster is reduced. This is a particular problem for discrete sequences, where state-of-the-art models are data-hungry. The success of this approach thus depends on the quality of the clustering, i.e., whether the individual learning problems are sufficiently simpler than the joint problem. While clustering discrete sequences automatically is a challenging and domain-specific task, it is often easy for human domain experts, given the right tools. In this paper, we adapt a state-of-the-art visual analytics tool for discrete sequence clustering to obtain informed clusters from domain experts and use LSTMs to model each cluster individually. Our extensive empirical evaluation indicates that this informed clustering outperforms automatic clusterings and that our approach outperforms state-of-the-art novelty detection methods for discrete sequences in three real-world application scenarios. In particular, decomposition outperforms a global model despite less training data on each individual cluster.
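
The decomposition idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: an add-one smoothed bigram model stands in for the per-cluster LSTM, the cluster labels are assumed to be given (in the paper they come from domain experts using a visual analytics tool), and every name here (`BigramModel`, `fit_per_cluster`, the toy `readers`/`writers` data) is hypothetical. The novelty score is the average negative log-likelihood of a sequence under a model, so a higher score means a larger deviation from the model's predictions.

```python
import math
from collections import Counter, defaultdict

START = "<s>"  # hypothetical start-of-sequence marker


class BigramModel:
    """Add-one smoothed bigram model; a lightweight stand-in for an LSTM."""

    def __init__(self, vocab):
        self.vocab_size = len(vocab)
        self.counts = defaultdict(Counter)

    def fit(self, sequences):
        # Count how often each symbol follows each preceding symbol.
        for seq in sequences:
            for prev, nxt in zip([START] + seq[:-1], seq):
                self.counts[prev][nxt] += 1
        return self

    def novelty_score(self, seq):
        """Average negative log-likelihood per symbol (higher = more novel)."""
        nll = 0.0
        for prev, nxt in zip([START] + seq[:-1], seq):
            seen = self.counts.get(prev, Counter())
            p = (seen[nxt] + 1) / (sum(seen.values()) + self.vocab_size)
            nll -= math.log(p)
        return nll / len(seq)


def fit_per_cluster(sequences, labels, vocab):
    """Train one model per cluster label (labels are assumed to be given)."""
    grouped = defaultdict(list)
    for seq, label in zip(sequences, labels):
        grouped[label].append(seq)
    return {label: BigramModel(vocab).fit(seqs) for label, seqs in grouped.items()}


# Toy data: two distinct generating processes.
vocab = {"login", "read", "write", "logout"}
readers = [["login", "read", "logout"]] * 3
writers = [["login", "write", "logout"]] * 3
train = readers + writers
labels = ["readers"] * 3 + ["writers"] * 3

cluster_models = fit_per_cluster(train, labels, vocab)
global_model = BigramModel(vocab).fit(train)

normal = ["login", "read", "logout"]
suspicious = ["login", "write", "logout"]  # unusual for the "readers" cluster

for name, seq in [("normal", normal), ("suspicious", suspicious)]:
    print(name,
          "cluster score:", round(cluster_models["readers"].novelty_score(seq), 3),
          "global score:", round(global_model.novelty_score(seq), 3))
```

On this toy data the global model gives both sequences the same score, because writing is normal somewhere in the pooled data, while the model trained only on the "readers" cluster scores the suspicious sequence higher. The sketch also makes the trade-off visible: each cluster model is fit on half of the training sequences.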
