
PANDA: Adapting Pretrained Features for Anomaly Detection and Segmentation (2010.05903v3)

Published 12 Oct 2020 in cs.CV and cs.LG

Abstract: Anomaly detection methods require high-quality features. In recent years, the anomaly detection community has attempted to obtain better features using advances in deep self-supervised feature learning. Surprisingly, a very promising direction, using pretrained deep features, has been mostly overlooked. In this paper, we first empirically establish the perhaps expected, but unreported result, that combining pretrained features with simple anomaly detection and segmentation methods convincingly outperforms, much more complex, state-of-the-art methods. In order to obtain further performance gains in anomaly detection, we adapt pretrained features to the target distribution. Although transfer learning methods are well established in multi-class classification problems, the one-class classification (OCC) setting is not as well explored. It turns out that naive adaptation methods, which typically work well in supervised learning, often result in catastrophic collapse (feature deterioration) and reduce performance in OCC settings. A popular OCC method, DeepSVDD, advocates using specialized architectures, but this limits the adaptation performance gain. We propose two methods for combating collapse: i) a variant of early stopping that dynamically learns the stopping iteration ii) elastic regularization inspired by continual learning. Our method, PANDA, outperforms the state-of-the-art in the OCC, outlier exposure and anomaly segmentation settings by large margins.

Citations (229)

Summary

  • The paper demonstrates that adapting pretrained deep features with early stopping and elastic regularization significantly boosts anomaly detection performance.
  • It achieves a remarkable 96.2% ROCAUC on CIFAR10 and strong segmentation results on the MVTec dataset.
  • Simple baselines like DN2 and SPADE confirm that using pretrained features can outperform more complex state-of-the-art methods.

Analysis of "PANDA: Adapting Pretrained Features for Anomaly Detection and Segmentation"

The paper "PANDA: Adapting Pretrained Features for Anomaly Detection and Segmentation" makes a significant contribution to anomaly detection by proposing a method that leverages pretrained features rather than self-supervised ones. The authors critically evaluate the effectiveness of pretrained features across several anomaly detection settings, prominently the one-class classification (OCC) setting, anomaly segmentation, and outlier exposure scenarios.

Anomaly detection has advanced notably through the integration of deep learning techniques, which often extend classical methods with deep neural networks. Traditional methods, while useful, are limited to small datasets or require significant domain-specific feature engineering. Improving features through deep self-supervised learning has gained traction, yet the potential of pretrained features has remained largely untapped within this domain.

The main assertion of the paper is that pretrained features, combined with straightforward anomaly detection and segmentation methods, outperform complex state-of-the-art approaches. Contrary to the common assumption that self-supervised features are necessary, pretrained features yield superior generality and effectiveness. This empirical finding challenges prevailing methodology in the field. Notably, the PANDA method achieves a strong 96.2% ROCAUC on CIFAR10 without outlier exposure, substantially outpacing the 90.1% benchmark of past methods.

Addressing feature adaptation for anomaly detection, the authors emphasize mitigating catastrophic collapse, a significant challenge when using naive transfer learning approaches. PANDA proposes two mechanisms: a variant of early stopping tailored to the anomaly detection context that dynamically learns the stopping iteration, and elastic regularization inspired by continual learning. Both directly counter feature deterioration and prevent the performance collapse commonly observed in OCC settings.
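The elastic regularization idea can be sketched as an EWC-style penalty that discourages the adapted weights from drifting far from their pretrained values, added to a feature-compactness objective. The function names and the diagonal-Fisher weighting below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def ewc_penalty(theta, theta0, fisher, lam=1.0):
    """EWC-style elastic penalty (sketch): penalize drift of adapted
    weights `theta` from pretrained weights `theta0`, weighted by a
    diagonal Fisher-information estimate `fisher` per parameter."""
    return 0.5 * lam * np.sum(fisher * (theta - theta0) ** 2)

def adaptation_loss(features, center, theta, theta0, fisher, lam=1.0):
    """Illustrative total objective: a DeepSVDD-like compactness term
    (mean squared distance of features to a fixed center) plus the
    elastic regularizer that combats catastrophic collapse."""
    compact = np.mean(np.sum((features - center) ** 2, axis=1))
    return compact + ewc_penalty(theta, theta0, fisher, lam)
```

With `lam = 0` this reduces to a plain compactness loss, which is exactly the regime where naive fine-tuning tends to collapse; the elastic term anchors the features to the pretrained solution.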

The paper's practical relevance is underscored by its intuitive and straightforward baselines, exemplified by "Deep Nearest Neighbors" (DN2) for anomaly detection and "Semantic Pyramid Anomaly Detection" (SPADE) for anomaly segmentation. These baselines, which use pretrained features without additional complexity, are reported to outperform state-of-the-art models.
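DN2's core idea is simple enough to sketch in a few lines: score a test sample by its mean distance to its k nearest neighbors among the pretrained features of normal training images. This is a minimal numpy illustration of the scoring rule, not the paper's implementation (which extracts the features with a pretrained ResNet):

```python
import numpy as np

def dn2_score(train_feats, test_feat, k=2):
    """DN2-style anomaly score (sketch): mean Euclidean distance from a
    test feature to its k nearest neighbors among the features of
    normal training samples. Higher score = more anomalous."""
    dists = np.linalg.norm(train_feats - test_feat, axis=1)
    return float(np.mean(np.sort(dists)[:k]))
```

A point close to the normal training features receives a low score, while a distant point receives a high one; thresholding this score yields the detection decision.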

For anomaly segmentation, their SPADE technique uses a pyramid of features from different layers of a ResNet model, improving on previous architectures. In segmentation tasks, a pixel-level ROCAUC of 96.2% on the MVTec dataset highlights SPADE's practical utility and robustness compared with numerous established methods.
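The pyramid construction can be sketched roughly as follows: upsample each layer's feature map to the finest layer's resolution and concatenate along the channel axis, giving every pixel a multi-scale descriptor. This simplified sketch uses nearest-neighbor upsampling and assumes the coarser resolutions divide the finest one; the actual method operates on ResNet activations and scores each pixel by kNN distance to corresponding features of retrieved normal images:

```python
import numpy as np

def feature_pyramid(layers):
    """Build a SPADE-style per-pixel feature pyramid (sketch).
    `layers`: list of (C_i, H_i, W_i) arrays, finest resolution first,
    where each H_i, W_i evenly divides the finest H, W.
    Returns a (sum C_i, H, W) array of concatenated multi-scale features."""
    h, w = layers[0].shape[1:]
    upsampled = []
    for f in layers:
        ry, rx = h // f.shape[1], w // f.shape[2]
        # nearest-neighbor upsample to the finest resolution
        upsampled.append(np.repeat(np.repeat(f, ry, axis=1), rx, axis=2))
    return np.concatenate(upsampled, axis=0)
```

Each spatial location of the result then carries both fine, localized features and coarse, semantic ones, which is what lets a simple per-pixel nearest-neighbor score localize anomalies.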

This paper not only underscores the strength of pretrained features but also invites further exploration into broader applications across different data modalities. The future development of generalized feature extractors could alleviate the dependency on specific modalities and datasets, paving the way for adaptive, cross-domain anomaly detection systems.

In conclusion, "PANDA: Adapting Pretrained Features for Anomaly Detection and Segmentation" presents a well-substantiated argument favoring pretrained deep features over self-supervised alternatives. The detailed numerical analyses and methodological innovations set the stage for enriched performance and broader adoption of such approaches in both industry and academia. That no task-specific anomaly datasets are required further strengthens its applicability across diverse real-world settings.