Invariant Anomaly Detection under Distribution Shifts: A Causal Perspective (2312.14329v1)

Published 21 Dec 2023 in cs.LG

Abstract: Anomaly detection (AD) is the machine learning task of identifying highly discrepant abnormal samples by relying solely on the consistency of the normal training samples. Under a distribution shift, the assumption that training and test samples are drawn from the same distribution breaks down. In this work, by leveraging tools from causal inference, we attempt to increase the resilience of anomaly detection models to different kinds of distribution shifts. We begin by elucidating a simple yet necessary statistical property that ensures invariant representations, which is critical for robust AD under both domain and covariate shifts. From this property, we derive a regularization term which, when minimized, leads to partial distribution invariance across environments. Through extensive experimental evaluation on both synthetic and real-world tasks, covering a range of six different AD methods, we demonstrate significant improvements in out-of-distribution performance. Under both covariate and domain shift, models regularized with our proposed term showed markedly increased robustness. Code is available at: https://github.com/JoaoCarv/invariant-anomaly-detection.
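The abstract does not spell out the regularization term, but a common way to encourage distribution invariance of learned representations across environments is to penalize a two-sample discrepancy, such as the Maximum Mean Discrepancy (MMD), between per-environment embeddings. The sketch below is an illustrative assumption, not the paper's exact regularizer: an RBF-kernel MMD penalty summed over environment pairs, which could be added to an anomaly detector's training loss.

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    """Pairwise RBF (Gaussian) kernel matrix between rows of x and y."""
    sq_dists = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def mmd2(x, y, sigma=1.0):
    """Biased estimate of squared Maximum Mean Discrepancy between
    the samples x and y; zero when both come from the same distribution."""
    return (rbf_kernel(x, x, sigma).mean()
            - 2.0 * rbf_kernel(x, y, sigma).mean()
            + rbf_kernel(y, y, sigma).mean())

def invariance_penalty(env_embeddings, sigma=1.0):
    """Sum of pairwise MMD^2 terms over environments.

    Minimizing this pulls the embedding distributions of all
    environments together, i.e. toward (partial) invariance.
    env_embeddings: list of (n_i, d) arrays, one per environment.
    """
    total = 0.0
    for i in range(len(env_embeddings)):
        for j in range(i + 1, len(env_embeddings)):
            total += mmd2(env_embeddings[i], env_embeddings[j], sigma)
    return total
```

In practice such a penalty would be computed on mini-batch embeddings from a differentiable encoder and weighted against the base AD objective; the names `mmd2` and `invariance_penalty` here are hypothetical helpers for illustration.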

Authors (5)
  1. João B. S. Carvalho
  2. Mengtao Zhang
  3. Robin Geyer
  4. Carlos Cotrini
  5. Joachim M. Buhmann