Deep Semi-Supervised Anomaly Detection
The paper "Deep Semi-Supervised Anomaly Detection" proposes Deep SAD, a method that improves anomaly detection by leveraging both unlabeled data and a small set of labeled samples. The work diverges from purely unsupervised approaches and addresses a gap in prior deep anomaly detection research, which largely assumed that any labeled data consists only of normal samples.
Main Contributions
This work offers several key contributions:
- Introduction of Deep SAD: Deep SAD generalizes the Deep SVDD framework to the semi-supervised setting, training on labeled anomalies alongside unlabeled samples. Through end-to-end training, the network learns latent representations that concentrate normal data near a fixed center while pushing labeled anomalies away from it, sharpening the model's ability to separate normal from anomalous data.
- Information-Theoretic Framework: The authors introduce an information-theoretic perspective on deep anomaly detection, arguing that the latent distribution of normal data should have low entropy, whereas that of anomalies should have high entropy. This framework offers a new lens for interpreting and designing anomaly detection methods.
- Experimental Evaluation: Comprehensive experiments on datasets such as MNIST, Fashion-MNIST, CIFAR-10, and traditional AD benchmarks demonstrate the superior or comparable performance of Deep SAD against shallow, hybrid, and deep competitors in various scenarios. These scenarios include variable amounts of labeled data, pollution of unlabeled data, and diversity in labeled anomaly classes.
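The training objective sketched in the first bullet can be illustrated in code. The following is a minimal, dependency-free sketch under stated assumptions, not the authors' implementation: `deep_sad_loss`, `anomaly_score`, and the fixed `center` are illustrative names, the network is assumed to have already mapped inputs to the latent vectors passed in, and the paper's weight-decay regularizer is omitted. Unlabeled points are pulled toward the hypersphere center as in Deep SVDD, while labeled points enter with the squared distance raised to the power of their label (+1 for normal, -1 for anomalous), so labeled anomalies are penalized for lying close to the center.

```python
def deep_sad_loss(embeddings, labels, center, eta=1.0, eps=1e-6):
    """Simplified sketch of the Deep SAD objective for one batch.

    embeddings: latent vectors phi(x) as lists of floats (network output)
    labels: 0 = unlabeled, +1 = labeled normal, -1 = labeled anomaly
    center: hypersphere center c in latent space
    eta: hyperparameter weighting the labeled term
    eps: small constant to avoid division by zero for the -1 exponent
    """
    total = 0.0
    for z, y in zip(embeddings, labels):
        dist_sq = sum((zi - ci) ** 2 for zi, ci in zip(z, center))
        if y == 0:
            # Unlabeled term: pull toward the center, as in Deep SVDD.
            total += dist_sq
        else:
            # Labeled term: exponent y = -1 inverts the distance, so
            # anomalies near the center incur a large penalty.
            total += eta * (dist_sq + eps) ** y
    return total / len(embeddings)


def anomaly_score(z, center):
    """Score a point by squared distance to the center; larger = more anomalous."""
    return sum((zi - ci) ** 2 for zi, ci in zip(z, center))
```

As a sanity check of the design, a labeled anomaly far from the center contributes a smaller loss than one near it, which is exactly the gradient signal that drives anomalous representations outward during training.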
Numerical Results
The paper reports strong numerical results, with Deep SAD consistently matching or outperforming the baselines, especially on the more complex CIFAR-10 dataset. Performance improved markedly as the fraction of labeled anomalies increased, underscoring the value of even limited supervision. The method also proved robust to pollution of the unlabeled training data and maintained competitive performance as the number of known anomaly classes varied.
Implications and Future Directions
Practically, the research implies that incorporating even a small amount of labeled anomaly data can significantly enhance anomaly detection systems. This capability is particularly vital in domains where anomalies are rare but critical, such as fraud detection, network security, and medical diagnosis.
Theoretically, the introduction of an information-theoretic interpretation provides a foundation for future exploration into optimizing the balance between mutual information and entropy in latent representations. This framework might inspire new methodologies that further exploit structured information in data to improve anomaly detection in increasingly complex environments.
Future work could explore the application of this framework to other domains, investigate scalability to even larger datasets, and refine the model's sensitivity to different amounts of labeled data. Additionally, understanding how to efficiently incorporate domain-specific knowledge into this framework remains an open question that could drive practical advancements in the field.
Overall, Deep SAD represents a notable advancement in anomaly detection, integrating theoretical insights with practical demands, and opening new pathways for research in semi-supervised learning.