Multiresolution Knowledge Distillation for Anomaly Detection (2011.11108v1)

Published 22 Nov 2020 in cs.CV

Abstract: Unsupervised representation learning has proved to be a critical component of anomaly detection/localization in images. The challenges to learn such a representation are two-fold. Firstly, the sample size is not often large enough to learn a rich generalizable representation through conventional techniques. Secondly, while only normal samples are available at training, the learned features should be discriminative of normal and anomalous samples. Here, we propose to use the "distillation" of features at various layers of an expert network, pre-trained on ImageNet, into a simpler cloner network to tackle both issues. We detect and localize anomalies using the discrepancy between the expert and cloner networks' intermediate activation values given the input data. We show that considering multiple intermediate hints in distillation leads to better exploiting the expert's knowledge and more distinctive discrepancy compared to solely utilizing the last layer activation values. Notably, previous methods either fail in precise anomaly localization or need expensive region-based training. In contrast, with no need for any special or intensive training procedure, we incorporate interpretability algorithms in our novel framework for the localization of anomalous regions. Despite the striking contrast between some test datasets and ImageNet, we achieve competitive or significantly superior results compared to the SOTA methods on MNIST, F-MNIST, CIFAR-10, MVTecAD, Retinal-OCT, and two Medical datasets on both anomaly detection and localization.

Authors (5)

Mohammadreza Salehi (26 papers)
Niousha Sadjadi (3 papers)
Soroosh Baselizadeh (6 papers)
Mohammad Hossein Rohban (43 papers)
Hamid R. Rabiee (85 papers)

Citations (380)

Summary

The paper demonstrates that multiresolution distillation using intermediate feature hints significantly improves anomaly detection and localization.
The paper introduces a cloner network architecture that efficiently mimics the expert network's critical feature representations while being computationally lightweight.
The paper uses gradient-based interpretability methods to localize anomalies without region-based training, achieving results competitive with or better than SOTA methods on diverse datasets.

Multiresolution Knowledge Distillation for Anomaly Detection: An Expert Analysis

The paper addresses anomaly detection and localization in images through unsupervised representation learning. Its central idea is to distill knowledge from an "expert" network, pre-trained on ImageNet, into a simpler network termed the "cloner." This tackles two challenges: the training set is often too small to learn a rich, generalizable representation by conventional means, and the learned features must discriminate between normal and anomalous samples even though only normal samples are available during training.

Core Methodology

The framework detects and localizes anomalies from the discrepancy between the intermediate activations of the expert and cloner networks for a given input. Because the cloner is trained only on normal samples, its activations track the expert's on normal inputs and diverge on anomalous ones. Distillation is applied at several layers rather than only the final one, so the cloner exploits more of the expert's knowledge. The approach needs no intensive region-based training; instead, it applies interpretability algorithms to identify anomalous regions.
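The multi-layer discrepancy described above can be sketched in a few lines of plain Python. This summary does not spell out the exact loss, so the code assumes a per-layer score that adds a mean squared difference (the "value" term) to a cosine dissimilarity (the "direction" term), weighted by an illustrative parameter `lam`. The function names and the toy activations are hypothetical and stand in for real network outputs.

```python
import math


def flatten(x):
    """Flatten a nested list of numbers into a flat list of floats."""
    if isinstance(x, (int, float)):
        return [float(x)]
    out = []
    for item in x:
        out.extend(flatten(item))
    return out


def layer_discrepancy(expert_act, cloner_act, lam=0.5, eps=1e-12):
    """Discrepancy between one expert layer and the matching cloner layer.

    Assumed form: mean squared difference plus lam times cosine dissimilarity.
    """
    e = flatten(expert_act)
    c = flatten(cloner_act)
    if len(e) != len(c):
        raise ValueError("expert and cloner activations must have the same size")
    value = sum((a - b) ** 2 for a, b in zip(e, c)) / len(e)
    dot = sum(a * b for a, b in zip(e, c))
    norm = math.sqrt(sum(a * a for a in e)) * math.sqrt(sum(b * b for b in c))
    direction = 1.0 - dot / (norm + eps)
    return value + lam * direction


def anomaly_score(expert_acts, cloner_acts, lam=0.5):
    """Sum the per-layer discrepancies over all chosen intermediate layers."""
    return sum(
        layer_discrepancy(e, c, lam) for e, c in zip(expert_acts, cloner_acts)
    )


# Toy activations at two layers (hypothetical values).
expert = [[1.0, 2.0], [0.5, 0.5, 1.0]]
cloner_on_normal = [[1.0, 2.0], [0.5, 0.5, 1.0]]      # cloner matches the expert
cloner_on_anomaly = [[2.0, 1.0], [0.5, -0.5, 1.0]]    # cloner diverges

normal_score = anomaly_score(expert, cloner_on_normal)
anomalous_score = anomaly_score(expert, cloner_on_anomaly)
```

Under these assumptions the same quantity serves as the training objective on normal data and as the test-time anomaly score, and an input is flagged when its score exceeds a threshold chosen on held-out data.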

Evaluation and Results

The authors validate their approach on MNIST, F-MNIST, CIFAR-10, MVTecAD, Retinal-OCT, and two medical datasets. Across these benchmarks, the framework achieves results that are competitive with or significantly better than state-of-the-art (SOTA) methods on both anomaly detection and localization, despite the striking contrast between some of the test datasets and ImageNet.
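Comparisons of this kind are usually scored by ranking test samples by anomaly score. As an illustration, and assuming the area under the ROC curve (AUROC) as the metric, which this summary does not itself name, a minimal pairwise implementation looks like this. The scores and labels are made-up values, not results from the paper.

```python
def auroc(scores, labels):
    """AUROC as the probability that a random anomalous sample (label 1)
    receives a higher score than a random normal sample (label 0).
    Ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one normal and one anomalous sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical anomaly scores for four test images.
example_scores = [0.1, 0.4, 0.35, 0.8]
example_labels = [0, 0, 1, 1]
example_auroc = auroc(example_scores, example_labels)  # 3 of 4 pairs ranked correctly
```

The pairwise form is quadratic in the number of samples, which is fine for a sketch; a rank-based formulation would be preferable for large test sets.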

Methodological Insights

Multilayer Distillation: By matching multiple intermediate "hints" rather than only the last layer's activations, the method exploits more of the expert's knowledge and produces a more distinctive discrepancy between normal and anomalous inputs.
Cloner Network Architecture: The compact architecture of the cloner network is pivotal, focusing primarily on the essential features that distinguish between normal and anomalous inputs without getting "distracted" by irrelevant features present in the expert network.
Adoption of Interpretability Methods: Gradient-based interpretability methods attribute the expert-cloner discrepancy back to input pixels, which yields precise anomaly localization without computationally expensive region-based training.
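The localization idea in the last point can be illustrated with a toy example: take the gradient of the expert-cloner discrepancy with respect to each input pixel and treat its magnitude as an anomaly map. The sketch below is not the authors' implementation. It assumes a one-dimensional "image", replaces both networks with hypothetical stand-in functions, and uses central finite differences where a real system would use automatic differentiation.

```python
def toy_expert(x):
    """Stand-in expert: passes pixel values through unchanged."""
    return list(x)


def toy_cloner(x):
    """Stand-in cloner: agrees with the expert only on the 'normal'
    pixel range [0, 1] and saturates outside it."""
    return [min(max(v, 0.0), 1.0) for v in x]


def discrepancy(x):
    """Squared difference between expert and cloner outputs for input x."""
    return sum((e - c) ** 2 for e, c in zip(toy_expert(x), toy_cloner(x)))


def numerical_gradient(loss_fn, x, h=1e-5):
    """Central finite-difference gradient of loss_fn with respect to x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += h
        down = list(x)
        down[i] -= h
        grad.append((loss_fn(up) - loss_fn(down)) / (2 * h))
    return grad


def localization_map(loss_fn, x):
    """Absolute input gradient, normalized so the largest entry is 1."""
    mags = [abs(g) for g in numerical_gradient(loss_fn, x)]
    peak = max(mags)
    return [m / peak if peak > 0 else 0.0 for m in mags]


# The third pixel lies far outside the range the cloner reproduces.
image = [0.2, 0.5, 3.0, 0.7]
anomaly_map = localization_map(discrepancy, image)
most_anomalous_pixel = anomaly_map.index(max(anomaly_map))
```

Only the out-of-range pixel influences the discrepancy, so the map concentrates on it. With real networks the raw gradient map is typically noisy, which is one reason to reach for dedicated interpretability algorithms rather than plain gradients.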

Implications and Future Directions

The findings matter for anomaly detection systems in which computational efficiency must not come at the cost of accuracy, since the cloner is simpler than the expert and needs no special or intensive training procedure. More broadly, the framework shows that knowledge from a network trained on an unrelated domain (ImageNet) can be transferred effectively to specialized tasks with limited domain-specific training data.

Future research could further optimize the cloner architecture for different anomaly detection contexts, or extend the framework beyond the image benchmarks studied here (which already include industrial inspection via MVTecAD and medical imaging) to other anomaly-prone domains such as cybersecurity. Deeper investigation of the interpretability and robustness of the localization method would also strengthen the case for using this technique in safety-critical applications.

Conclusion

The approach undertaken in this paper presents a compelling enhancement for anomaly detection and localization. Through multiresolution knowledge distillation and interpretability methods, the authors offer a scalable, effective solution adaptable to various datasets and domains, emphasizing methodological precision and computational efficiency.
