Deep Anomaly Detection with Deviation Networks
The paper "Deep Anomaly Detection with Deviation Networks" introduces a framework for end-to-end training of a neural network that directly optimizes anomaly scores. Traditionally, anomaly detection has been treated as a two-step process: first learning feature representations, then using those representations to compute anomaly scores. The authors bypass the intermediate representation-learning step and optimize anomaly scores directly through a neural deviation learning approach.
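The architectural idea can be made concrete with a minimal sketch: a network that maps raw features straight to a scalar anomaly score, with no separate downstream detector. The dimensions and initialization below are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 10 input features, one hidden layer of 16 units.
W1 = rng.standard_normal((10, 16)) * 0.1
b1 = np.zeros(16)
w2 = rng.standard_normal(16) * 0.1

def anomaly_score(x):
    """Map input features directly to a scalar anomaly score."""
    h = np.maximum(0.0, x @ W1 + b1)  # ReLU intermediate representation
    return float(h @ w2)              # linear scoring unit: one scalar output

score = anomaly_score(rng.standard_normal(10))
```

Because the scalar output is itself the anomaly score, a loss defined on it (such as the deviation loss below) trains the whole network end to end.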
Core Contributions
- End-to-End Anomaly Score Optimization: The framework shifts from representation learning to direct score optimization: the neural network is trained to output anomaly scores directly. It leverages a small number of labeled anomalies, which are often available in practical scenarios, to guide the learning process, making the approach more data-efficient than existing methods.
- Deviation Networks (DevNet): The paper presents DevNet as an instantiation of the proposed framework. DevNet uses a Z-score-based deviation loss with a Gaussian prior over anomaly scores: the scores of normal objects are systematically pushed towards a reference score derived from the prior, while the scores of anomalous objects are forced to deviate significantly from that reference.
- Effective Use of Labeled Anomalies: By exploiting even a handful of labeled anomalies, DevNet incorporates prior knowledge about anomalies into training. This enables DevNet to sustain its performance on high-dimensional data and under considerable noise.
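The deviation loss described above can be sketched as follows. The margin and the number of prior samples are illustrative choices in the spirit of the paper, not its exact hyperparameters.

```python
import numpy as np

def deviation_loss(scores, labels, margin=5.0, n_ref=5000, seed=0):
    """Z-score deviation loss with a standard-Gaussian score prior.

    labels: 1 for the (few) labeled anomalies, 0 for normal/unlabeled data.
    margin and n_ref are illustrative values, not the paper's exact settings.
    """
    rng = np.random.default_rng(seed)
    ref = rng.standard_normal(n_ref)          # reference scores drawn from the prior
    dev = (scores - ref.mean()) / ref.std()   # Z-score deviation of each score
    # Normal points: pull the deviation toward the reference score (|dev| -> 0).
    # Anomalies: push the deviation above the margin, enforcing clear separation.
    per_point = (1 - labels) * np.abs(dev) + labels * np.maximum(0.0, margin - dev)
    return float(per_point.mean())
```

A score near the reference (e.g. 0 for a normal point) incurs almost no loss, while a labeled anomaly is penalized until its score deviates past the margin.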
Experimental Validation
The empirical evaluation conducted in this paper demonstrates that DevNet consistently outperforms state-of-the-art methods on several benchmark datasets across varied domains, including cybersecurity, finance, and healthcare. These results highlight the superior data efficiency achieved by DevNet—the model requires substantially fewer labeled examples to reach performance levels on par with or superior to traditional two-step deep learning methods.
In terms of quantitative performance:
- DevNet shows significant AUC-ROC improvements of 3% to 29% over competing methods, and substantial AUC-PR improvements of 21% to 309%.
- The end-to-end framework ties the training objective directly to anomaly scoring, yielding better alignment between what the model optimizes and the detection metrics that matter in practice.
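As background on the metrics reported above: AUC-ROC is the probability that a randomly chosen anomaly is scored above a randomly chosen normal point, which can be computed directly from the scores. This small implementation is illustrative and not taken from the paper.

```python
import numpy as np

def auc_roc(labels, scores):
    """P(score of a random anomaly > score of a random normal point),
    counting ties as half a win."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    wins = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return float(wins + 0.5 * ties)

# A scorer that ranks every anomaly above every normal point gets AUC-ROC 1.0.
labels = np.array([0, 0, 0, 1, 1])
scores = np.array([0.1, 0.2, 0.3, 0.8, 0.9])
result = auc_roc(labels, scores)  # -> 1.0
```

AUC-PR, by contrast, summarizes precision against recall and is far more sensitive to class imbalance, which is why the relative gains on it are so much larger.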
Implications and Future Directions
The implications of this research extend beyond the current state of anomaly detection. It provides a foundation for integrating minimal supervision into an inherently unsupervised domain, bridging the gap between scarce labeled anomalies and effective anomaly scoring, and it underscores the potential of direct score optimization in deep learning, particularly in high-dimensional and noisy environments.
Future work could explore hybrid models that balance data-driven and prior-driven scoring, improving scalability and interpretability in complex real-world settings. Additionally, applying this end-to-end learning framework to visual and sequential data, with appropriate neural network architectures, could further validate and extend its applicability across diverse data types.
Overall, this paper provides a significant advancement in anomaly detection methodologies by offering an efficient and insightful approach that capitalizes on the minimal supervision available through labeled anomalies.