Video Anomaly Detection via Spatio-Temporal Pseudo-Anomaly Generation: A Unified Approach (2311.16514v2)

Published 27 Nov 2023 in cs.CV, cs.AI, and cs.LG

Abstract: Video Anomaly Detection (VAD) is an open-set recognition task, which is usually formulated as a one-class classification (OCC) problem, where the training data consists of videos with normal instances while the test data contains both normal and anomalous instances. Recent works have investigated the creation of pseudo-anomalies (PAs) using only the normal data, making strong assumptions about real-world anomalies with regard to abnormality of objects and speed of motion in order to inject prior information about anomalies into an autoencoder (AE) based reconstruction model during training. This work proposes a novel method for generating generic spatio-temporal PAs by inpainting a masked-out region of an image using a pre-trained Latent Diffusion Model and further perturbing the optical flow using mixup to emulate spatio-temporal distortions in the data. In addition, we present a simple unified framework to detect real-world anomalies under the OCC setting by learning three types of anomaly indicators, namely reconstruction quality, temporal irregularity and semantic inconsistency. Extensive experiments on four VAD benchmark datasets, namely Ped2, Avenue, ShanghaiTech and UBnormal, demonstrate that our method performs on par with other existing state-of-the-art PA generation and reconstruction based methods under the OCC setting. Our analysis also examines the transferability and generalisation of PAs across these datasets, offering valuable insights by identifying real-world anomalies through PAs.
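
To make the "perturbing the optical flow using mixup" step concrete, here is a minimal sketch of generic mixup applied to a rectangular patch of a flow field. It is an illustration under stated assumptions, not the authors' implementation: the function name `mixup_flow_patch`, the patch parameters, the choice of `alpha`, and the representation of a flow field as a grid of `(dx, dy)` tuples are all hypothetical. The only part taken from the mixup formulation itself is the convex combination with a coefficient drawn from a Beta(alpha, alpha) distribution.

```python
import random


def mixup_flow_patch(flow_a, flow_b, top, left, height, width, alpha=0.4, rng=None):
    """Blend a rectangular patch of flow_b into flow_a with a mixup coefficient.

    flow_a, flow_b: H x W grids (lists of rows) of (dx, dy) tuples, same shape.
    Returns (mixed_flow, lam). Pixels outside the patch are copied from flow_a.
    All names and defaults here are illustrative assumptions.
    """
    rng = rng or random.Random()
    # Mixup coefficient: lambda ~ Beta(alpha, alpha), always in [0, 1].
    lam = rng.betavariate(alpha, alpha)
    # Copy the rows so the input flow field is left unmodified.
    mixed = [list(row) for row in flow_a]
    for y in range(top, top + height):
        for x in range(left, left + width):
            ax, ay = flow_a[y][x]
            bx, by = flow_b[y][x]
            # Convex combination of the two flow vectors inside the patch.
            mixed[y][x] = (lam * ax + (1 - lam) * bx,
                           lam * ay + (1 - lam) * by)
    return mixed, lam


# Usage: perturb a 2x2 patch of a uniform rightward flow with a downward flow.
flow_a = [[(1.0, 0.0)] * 4 for _ in range(4)]
flow_b = [[(0.0, 1.0)] * 4 for _ in range(4)]
mixed, lam = mixup_flow_patch(flow_a, flow_b, top=1, left=1, height=2, width=2,
                              rng=random.Random(0))
```

In this sketch, vectors inside the patch become a blend of the two motion patterns while the rest of the field is untouched, which is one simple way to emulate a localized temporal distortion. How the patch is chosen and which second flow field is mixed in are left open here, since the abstract does not specify them.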

Authors (7)
  1. Ayush K. Rai (7 papers)
  2. Tarun Krishna (10 papers)
  3. Feiyan Hu (9 papers)
  4. Alexandru Drimbarean (4 papers)
  5. Kevin McGuinness (76 papers)
  6. Alan F. Smeaton (85 papers)
  7. Noel E. O'Connor (70 papers)
Citations (1)