MULDE: Multiscale Log-Density Estimation via Denoising Score Matching for Video Anomaly Detection (2403.14497v1)
Abstract: We propose a novel approach to video anomaly detection: we treat feature vectors extracted from videos as realizations of a random variable with a fixed distribution and model this distribution with a neural network. This lets us estimate the likelihood of test videos and detect video anomalies by thresholding the likelihood estimates. We train our video anomaly detector using a modification of denoising score matching, a method that injects training data with noise to facilitate modeling its distribution. To eliminate hyperparameter selection, we model the distribution of noisy video features across a range of noise levels and introduce a regularizer that tends to align the models for different levels of noise. At test time, we combine anomaly indications at multiple noise scales with a Gaussian mixture model. Running our video anomaly detector induces minimal delays as inference requires merely extracting the features and forward-propagating them through a shallow neural network and a Gaussian mixture model. Our experiments on five popular video anomaly detection benchmarks demonstrate state-of-the-art performance, both in the object-centric and in the frame-centric setup.
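To make the training idea in the abstract concrete, the following is a minimal sketch of plain denoising score matching on a one-dimensional toy problem. It is not the paper's method: the neural network is replaced by a Gaussian model with parameters `mu` and `var`, the paper's modification of the objective and its regularizer are omitted, and no Gaussian mixture model is fitted. All function and variable names are hypothetical. The loss compares the model's score at a noisy sample with the score of the Gaussian noise kernel, and the second function evaluates a negative log-density (up to a constant) at several noise levels, giving one anomaly indication per scale.

```python
import math
import random


def dsm_loss(mu, var, data, sigma, seed=0, n_noise=8):
    """Monte Carlo estimate of the denoising score matching loss for a toy
    1-D Gaussian model with log-density -(x - mu)^2 / (2 * var) + const."""
    rng = random.Random(seed)  # fixed seed: every candidate sees the same noise
    total = 0.0
    for x in data:
        for _ in range(n_noise):
            x_noisy = x + sigma * rng.gauss(0.0, 1.0)
            model_score = -(x_noisy - mu) / var   # d/dx log q(x) at the noisy point
            target = (x - x_noisy) / sigma ** 2   # score of the Gaussian noise kernel
            total += 0.5 * (model_score - target) ** 2
    return total / (len(data) * n_noise)


def multiscale_anomaly_scores(x, mu, data_var, sigmas):
    """Negative log-density of x (up to an additive constant) under the toy
    model at each noise level; noising N(mu, data_var) with sigma gives
    N(mu, data_var + sigma^2)."""
    scores = []
    for s in sigmas:
        v = data_var + s ** 2
        scores.append(0.5 * (x - mu) ** 2 / v + 0.5 * math.log(v))
    return scores


# Toy "normal" data drawn from N(0, 1) (an assumption for illustration only).
data_rng = random.Random(1)
data = [data_rng.gauss(0.0, 1.0) for _ in range(2000)]

sigma = 0.5
good = dsm_loss(0.0, 1.0 + sigma ** 2, data, sigma)  # matches the noisy data
bad = dsm_loss(3.0, 1.0 + sigma ** 2, data, sigma)   # mean is far off

sigmas = [0.1, 0.5, 1.0]
normal_scores = multiscale_anomaly_scores(0.2, 0.0, 1.0, sigmas)
odd_scores = multiscale_anomaly_scores(5.0, 0.0, 1.0, sigmas)

print("loss (matching model):", good)
print("loss (shifted model): ", bad)
print("scores for x=0.2:", normal_scores)
print("scores for x=5.0:", odd_scores)
```

In this toy setting the model that matches the noise-perturbed data distribution attains the lower loss, and a point far from the training data receives a higher score at every noise level. The abstract describes combining such per-scale indications with a Gaussian mixture model; that step is left out here.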