Anomaly Detection with Variance Stabilized Density Estimation

Published 1 Jun 2023 in cs.LG and cs.AI | (2306.00582v2)

Abstract: We propose a modified density estimation problem that is highly effective for detecting anomalies in tabular data. Our approach assumes that the density function is relatively stable (with lower variance) around normal samples. We have verified this hypothesis empirically using a wide range of real-world data. Then, we present a variance-stabilized density estimation problem for maximizing the likelihood of the observed samples while minimizing the variance of the density around normal samples. To obtain a reliable anomaly detector, we introduce a spectral ensemble of autoregressive models for learning the variance-stabilized distribution. We have conducted an extensive benchmark with 52 datasets, demonstrating that our method leads to state-of-the-art results while alleviating the need for data-specific hyperparameter tuning. Finally, we have used an ablation study to demonstrate the importance of each of the proposed components, followed by a stability analysis evaluating the robustness of our model.
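
The objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "variance of the density" is measured as the variance of the log-density over the training samples, and it substitutes a fixed diagonal Gaussian for the paper's spectral ensemble of autoregressive models. The names `gaussian_log_density`, `variance_stabilized_loss`, and the weight `lam` are hypothetical.

```python
import numpy as np


def gaussian_log_density(x, mean, log_std):
    """Log density of a diagonal Gaussian, summed over features.

    Stand-in density model (assumption); the paper uses autoregressive models.
    """
    z = (x - mean) / np.exp(log_std)
    return np.sum(-0.5 * z**2 - log_std - 0.5 * np.log(2 * np.pi), axis=1)


def variance_stabilized_loss(log_p, lam=1.0):
    """Negative mean log-likelihood plus lam times the variance of log_p.

    The first term rewards high likelihood on the observed samples; the second
    penalizes fluctuation of the (log-)density across those samples.
    Using the log-density here is an assumption of this sketch.
    """
    return -np.mean(log_p) + lam * np.var(log_p)


# Synthetic "normal" samples: 256 rows, 4 tabular features.
rng = np.random.default_rng(0)
x_train = rng.normal(size=(256, 4))

log_p = gaussian_log_density(x_train, mean=np.zeros(4), log_std=np.zeros(4))
plain_nll = variance_stabilized_loss(log_p, lam=0.0)   # ordinary maximum likelihood
stabilized = variance_stabilized_loss(log_p, lam=1.0)  # with the variance penalty
```

In a trainable model, the parameters would be optimized against `variance_stabilized_loss`, and a test sample with low log-density under the fitted model would be scored as more anomalous. With `lam=0.0` the loss reduces to ordinary maximum likelihood.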
