Sub-Image Anomaly Detection with Deep Pyramid Correspondences (2005.02357v3)

Published 5 May 2020 in cs.CV and cs.LG

Abstract: Nearest neighbor (kNN) methods utilizing deep pre-trained features exhibit very strong anomaly detection performance when applied to entire images. A limitation of kNN methods is the lack of segmentation map describing where the anomaly lies inside the image. In this work we present a novel anomaly segmentation approach based on alignment between an anomalous image and a constant number of the similar normal images. Our method, Semantic Pyramid Anomaly Detection (SPADE) uses correspondences based on a multi-resolution feature pyramid. SPADE is shown to achieve state-of-the-art performance on unsupervised anomaly detection and localization while requiring virtually no training time.

Citations (411)


Summary

The paper introduces SPADE, a method that combines pre-trained deep features with multi-level pyramid correspondences for precise sub-image anomaly detection.
The methodology uses a kNN search to retrieve normal images and computes dense pixel-wise anomaly scores for effective segmentation.
Experimental results on industrial and surveillance datasets demonstrate that SPADE outperforms autoencoder and GAN-based methods, highlighting its practical impact.

Sub-Image Anomaly Detection with Deep Pyramid Correspondences: An Overview

The paper "Sub-Image Anomaly Detection with Deep Pyramid Correspondences" by Niv Cohen and Yedid Hoshen introduces Semantic Pyramid Anomaly Detection (SPADE), a method for anomaly segmentation in images. In contrast to traditional k-nearest neighbor (kNN) approaches that score entire images, SPADE localizes anomalies by aligning an anomalous image with several similar normal images. Notably, it requires virtually no training and relies instead on pre-trained deep features.

Anomaly detection poses substantial challenges due to the unknown nature of anomalies. Cohen and Hoshen's SPADE tackles these challenges by focusing on a normal-only training setting. This setting, also known as semi-supervised anomaly detection, assumes access only to normal instance data during the training phase, offering practical benefits due to ease of data acquisition. The primary goal of SPADE is not just to classify an image as normal or anomalous but also to provide a detailed segmentation map pinpointing anomalies within an image.

Methodology and Novel Contributions

SPADE's methodology can be broken down into three main components:

Feature Extraction: SPADE employs a pre-trained deep neural network, such as an ImageNet-trained ResNet, to extract features from images. Features are taken from several levels of the network, forming a feature pyramid. Combining levels preserves both local detail and global context, which is essential for precise alignment and anomaly detection.
Normal Image Retrieval: For each test image, an initial kNN search retrieves the most similar normal images from the training set. Similarity is measured by the Euclidean distance between the images' feature representations.
Pixel-Level Anomaly Detection through Correspondence: SPADE establishes dense pixel-level correspondences between the test image and its k nearest normal neighbors. Each pixel's anomaly score is derived from this correspondence: regions with no close match in the retrieved normal images are marked as anomalous.
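
The three components above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the feature arrays are assumed to be precomputed by some pre-trained network, nearest-neighbor upsampling stands in for whatever resizing is actually used, and the names build_pyramid, retrieve_neighbors, pixel_anomaly_map, and kappa are hypothetical.

```python
import numpy as np


def build_pyramid(levels, out_hw):
    """Stage 1 (stand-in): bring per-level feature maps to one size and stack channels.

    levels: list of (h, w, c) arrays whose h and w evenly divide out_hw.
    Assumption: nearest-neighbor upsampling via np.repeat, chosen for brevity.
    """
    out_h, out_w = out_hw
    resized = []
    for fmap in levels:
        h, w, _ = fmap.shape
        up = np.repeat(np.repeat(fmap, out_h // h, axis=0), out_w // w, axis=1)
        resized.append(up)
    return np.concatenate(resized, axis=-1)


def retrieve_neighbors(test_vec, train_vecs, k):
    """Stage 2: indices of the k normal images closest to the test image (Euclidean)."""
    dists = np.linalg.norm(train_vecs - test_vec, axis=1)
    return np.argsort(dists)[:k]


def pixel_anomaly_map(test_feats, neighbor_feats, kappa=1):
    """Stage 3: score each pixel by its mean distance to the kappa closest gallery features.

    test_feats:     (H, W, C) per-pixel features of the test image
    neighbor_feats: (K, H, W, C) per-pixel features of the retrieved normal images
    """
    h, w, c = test_feats.shape
    queries = test_feats.reshape(-1, c)      # (H*W, C)
    gallery = neighbor_feats.reshape(-1, c)  # (K*H*W, C)
    # Pairwise squared distances: |q|^2 - 2 q.g + |g|^2
    d2 = (
        (queries ** 2).sum(axis=1)[:, None]
        - 2.0 * queries @ gallery.T
        + (gallery ** 2).sum(axis=1)[None, :]
    )
    dists = np.sqrt(np.maximum(d2, 0.0))     # clip tiny negatives from rounding
    nearest = np.sort(dists, axis=1)[:, :kappa]
    return nearest.mean(axis=1).reshape(h, w)
```

A pixel whose feature has a close match anywhere in the retrieved normal images receives a low score, while a pixel with no close match receives a high one; thresholding the resulting map would give a segmentation. The dense distance matrix grows with the number of pixels times the gallery size, so this sketch only suits small inputs.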

Experimental Evaluation

The paper presents an extensive evaluation on two datasets: MVTec, which simulates industrial fault detection, and the ShanghaiTech Campus (STC) dataset, which represents a surveillance setting. SPADE demonstrates state-of-the-art performance, surpassing existing approaches such as autoencoder-based and GAN-based anomaly detection models on both pixel-level segmentation and image-level detection tasks.
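
Both the image-level and the pixel-level task reduce to ranking scores against binary labels. The summary above does not name the metric, so the sketch below assumes the area under the ROC curve, a common choice for such benchmarks. The function name roc_auc is hypothetical, and the pairwise formulation is chosen for clarity rather than speed.

```python
import numpy as np


def roc_auc(scores, labels):
    """ROC AUC as the probability that a random anomalous sample
    outscores a random normal one (ties count as half).

    Assumption: labels are 1 for anomalous and 0 for normal. For the
    pixel-level task, pass flattened anomaly maps and ground-truth masks.
    """
    scores = np.asarray(scores, dtype=float).ravel()
    labels = np.asarray(labels).ravel().astype(bool)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one anomalous and one normal sample")
    # Compare every anomalous score with every normal score.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)
```

For example, roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]) evaluates to 0.75, since three of the four anomalous/normal pairs are ranked correctly. The pairwise comparison uses memory proportional to the product of the two group sizes, so a sort-based implementation would be preferable for full-resolution masks.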

One noteworthy finding is that SPADE works well with pre-trained features and no additional fine-tuning. This suggests that generic features learned on large datasets such as ImageNet capture both the broad and the localized contextual information needed for anomaly detection. The paper further acknowledges that fine-tuning these features could potentially improve performance.

Implications and Future Work

The practical implications of SPADE are significant for fields that need anomaly detection with minimal setup complexity. Examples include industrial quality control, surveillance for unusual activities, and medical imaging, where accurate anomaly localization is crucial.

Theoretically, the work contributes to a growing body of research emphasizing pre-trained features' utility in specialized tasks. While SPADE effectively demonstrates the power of pre-trained deep features combined with a feature pyramid strategy for sub-image alignment, future work could explore advancements in feature matching efficiency, which remains computationally demanding, especially in pixel-wise anomaly detection scenarios.

Additionally, exploring ensemble methods or augmenting SPADE with other unsupervised learning tools might yield further enhancements, particularly in environments with highly complex anomaly patterns or in datasets with limited volumes of normal data samples.

In conclusion, "Sub-Image Anomaly Detection with Deep Pyramid Correspondences" offers a compelling advancement in the domain of anomaly detection, built on a foundation of robust, pre-trained deep features and innovative alignment mechanisms. It sets a new benchmark that future methodological improvements can build upon, aiming to further refine anomaly detection and segmentation in digital imagery.
