Beyond the Benchmark: Detecting Diverse Anomalies in Videos (2310.01904v1)
Abstract: Video Anomaly Detection (VAD) plays a crucial role in modern surveillance systems, aiming to identify various anomalies in real-world situations. However, current benchmark datasets predominantly emphasize simple, single-frame anomalies such as novel object detection. This narrow focus restricts the advancement of VAD models. In this research, we advocate for an expansion of VAD investigations to encompass intricate anomalies that extend beyond conventional benchmark boundaries. To facilitate this, we introduce two datasets, HMDB-AD and HMDB-Violence, to challenge models with diverse action-based anomalies. These datasets are derived from the HMDB51 action recognition dataset. We further present Multi-Frame Anomaly Detection (MFAD), a novel method built upon the AI-VAD framework. AI-VAD utilizes single-frame features such as pose estimation and deep image encoding, and two-frame features such as object velocity. It then applies a density estimation algorithm to compute anomaly scores. To address complex multi-frame anomalies, we add deep video encoding features that capture long-range temporal dependencies, and logistic regression to enhance the final score calculation. Experimental results confirm our assumptions, highlighting existing models' limitations with new anomaly types. MFAD excels in both simple and complex anomaly detection scenarios.
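The scoring scheme outlined in the abstract (per-feature density estimation followed by logistic regression over the resulting scores) can be illustrated with a minimal sketch. This is not the authors' implementation. The feature names and dimensions, the synthetic Gaussian data, the k-nearest-neighbour distance used as a stand-in density estimator, and the small labeled calibration set used to fit the logistic regression are all assumptions made for illustration only.

```python
import numpy as np

# Hypothetical feature names and dimensions, chosen only for illustration.
FEATURES = {"pose": 8, "image": 16, "velocity": 4, "video": 16}
rng = np.random.default_rng(0)

def make_split(n, shift):
    """Synthetic stand-in for extracted features: one Gaussian array per feature."""
    return {name: rng.normal(shift, 1.0, size=(n, dim)) for name, dim in FEATURES.items()}

def labeled_split(n):
    """n normal samples (label 0) followed by n shifted 'anomalous' samples (label 1)."""
    normal, anomalous = make_split(n, 0.0), make_split(n, 3.0)
    feats = {name: np.vstack([normal[name], anomalous[name]]) for name in FEATURES}
    return feats, np.concatenate([np.zeros(n), np.ones(n)])

def knn_score(train, query, k=5):
    """Mean distance to the k nearest normal training vectors (higher = more anomalous)."""
    dists = np.linalg.norm(query[:, None, :] - train[None, :, :], axis=2)
    return np.sort(dists, axis=1)[:, :k].mean(axis=1)

def score_matrix(train, split):
    """One anomaly-score column per feature type."""
    return np.column_stack([knn_score(train[name], split[name]) for name in FEATURES])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, steps=2000):
    """Logistic regression fitted by plain gradient descent; returns (weights, bias)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        grad = sigmoid(X @ w + b) - y
        w -= lr * X.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b

# Normal-only training data for the density step.
normal_train = make_split(200, 0.0)
# Assumption: a small labeled set is available to fit the score-fusion step.
calib_feats, calib_y = labeled_split(40)
test_feats, test_y = labeled_split(40)

# Standardize the per-feature scores, then fuse them with logistic regression.
calib_scores = score_matrix(normal_train, calib_feats)
mu, sigma = calib_scores.mean(axis=0), calib_scores.std(axis=0)
w, b = fit_logistic((calib_scores - mu) / sigma, calib_y)

test_scores = (score_matrix(normal_train, test_feats) - mu) / sigma
probs = sigmoid(test_scores @ w + b)
accuracy = ((probs > 0.5).astype(float) == test_y).mean()
print(f"fused-score accuracy on synthetic data: {accuracy:.2f}")
```

In this sketch the learned weights decide how much each feature type contributes to the final score, which is one plausible reading of how a logistic regression could refine a combination of per-feature anomaly scores.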
Authors: Yoav Arad, Michael Werman