Papers
Topics
Authors
Recent
Search
2000 character limit reached

Data-Efficient and Interpretable Tabular Anomaly Detection

Published 3 Mar 2022 in cs.LG | (2203.02034v2)

Abstract: Anomaly detection (AD) plays an important role in numerous applications. We focus on two understudied aspects of AD that are critical for integration into real-world applications. First, most AD methods cannot incorporate labeled data that are often available in practice in small quantities and can be crucial to achieve high AD accuracy. Second, most AD methods are not interpretable, a bottleneck that prevents stakeholders from understanding the reason behind the anomalies. In this paper, we propose a novel AD framework that adapts a white-box model class, Generalized Additive Models, to detect anomalies using a partial identification objective which naturally handles noisy or heterogeneous features. In addition, the proposed framework, DIAD, can incorporate a small amount of labeled data to further boost anomaly detection performances in semi-supervised settings. We demonstrate the superiority of our framework compared to previous work in both unsupervised and semi-supervised settings using diverse tabular datasets. For example, under 5 labeled anomalies DIAD improves from 86.2\% to 89.4\% AUC by learning AD from unlabeled data. We also present insightful interpretations that explain why DIAD deems certain samples as anomalies.

Citations (15)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.