Toward Generalist Anomaly Detection via In-context Residual Learning with Few-shot Sample Prompts (2403.06495v3)
Abstract: This paper explores the problem of Generalist Anomaly Detection (GAD), aiming to train one single detection model that can generalize to detect anomalies in diverse datasets from different application domains without any further training on the target data. Some recent studies have shown that large pre-trained Vision-Language Models (VLMs) like CLIP have strong generalization capabilities on detecting industrial defects from various datasets, but their methods rely heavily on handcrafted text prompts about defects, making them difficult to generalize to anomalies in other applications, e.g., medical image anomalies or semantic anomalies in natural images. In this work, we propose to train a GAD model with few-shot normal images as sample prompts for AD on diverse datasets on the fly. To this end, we introduce a novel approach that learns an in-context residual learning model for GAD, termed InCTRL. It is trained on an auxiliary dataset to discriminate anomalies from normal samples based on a holistic evaluation of the residuals between query images and few-shot normal sample prompts. Regardless of the dataset, by the definition of an anomaly, larger residuals are expected for anomalies than for normal samples, thereby enabling InCTRL to generalize across different domains without further training. Comprehensive experiments on nine AD datasets are performed to establish a GAD benchmark that encapsulates the detection of industrial defect anomalies, medical anomalies, and semantic anomalies in both one-vs-all and multi-class settings, on which InCTRL is the best performer and significantly outperforms state-of-the-art competing methods. Code is available at https://github.com/mala-lab/InCTRL.
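The residual idea in the abstract can be illustrated with a minimal sketch. It is not the InCTRL implementation: the function name `residual_score`, the nearest-neighbour cosine residual, the max-pooling into an image-level score, and the random arrays standing in for patch features are all assumptions made for illustration, and the learned holistic scoring that InCTRL trains on auxiliary data is omitted entirely.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def residual_score(query_patches, prompt_patches):
    """Hypothetical training-free residual score (illustrative only).

    query_patches:  (P, D) patch features of the query image.
    prompt_patches: (K, P, D) patch features of K few-shot normal prompts.
    Returns (image_score, per_patch_residuals).
    """
    q = l2_normalize(query_patches)
    # Pool the patches of all K prompts into one reference set.
    p = l2_normalize(prompt_patches.reshape(-1, prompt_patches.shape[-1]))
    sim = q @ p.T  # (P, K*P) cosine similarities
    # Residual of a patch: distance to its most similar normal patch, in [0, 1].
    residual = 0.5 * (1.0 - sim.max(axis=1))
    # Assumption: the image-level score is the largest patch residual.
    return float(residual.max()), residual


# Synthetic stand-ins for encoder features (no real images or encoder used).
rng = np.random.default_rng(0)
prompts = rng.normal(size=(4, 16, 32))  # 4 prompts, 16 patches, 32 dims

# A "normal" query: a slightly perturbed copy of one prompt.
normal_query = prompts[0] + 0.01 * rng.normal(size=(16, 32))

# An "anomalous" query: same image with one patch replaced by unrelated features.
anomalous_query = normal_query.copy()
anomalous_query[5] = rng.normal(size=32)

normal_score, _ = residual_score(normal_query, prompts)
anomaly_score, residual_map = residual_score(anomalous_query, prompts)
print(f"normal: {normal_score:.4f}  anomalous: {anomaly_score:.4f}")
print("patch with largest residual:", int(residual_map.argmax()))
```

Because the score depends only on how far the query lies from the supplied normal prompts, changing the dataset means changing the prompts rather than retraining, which is the property the abstract relies on for cross-domain generalization.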
Authors: Jiawen Zhu, Guansong Pang