- The paper proposes a novel framework that reformulates abnormal event detection in video as a binary classification task, using object-centric convolutional auto-encoders for unsupervised feature learning.
- It leverages k-means clustering to form normality clusters and treats samples from the other clusters as dummy anomalies when training one-versus-rest classifiers.
- Empirical results show significant improvements, notably an 8.4% absolute AUC gain on the ShanghaiTech dataset, demonstrating robust anomaly detection.
Object-centric Auto-encoders and Dummy Anomalies for Abnormal Event Detection in Video
The paper presents an approach to abnormal event detection in video that departs from traditional outlier detection and instead treats the task as a binary classification problem. Its central contribution is two-fold. First, it proposes an unsupervised feature learning framework based on object-centric convolutional auto-encoders (CAEs) that capture both motion and appearance information. Second, it introduces a supervised classification stage that combines clustering with one-versus-rest classifiers to separate normal events from anomalies.
Methodology Overview
The paper recasts abnormal event detection as a multi-class classification problem over normal data: normal events are grouped into 'normality clusters' via k-means, each cluster is treated as a separate class, and a one-versus-rest binary classifier is trained to distinguish each cluster from the others. The samples belonging to the remaining clusters act as 'dummy anomalies', i.e., stand-ins for the truly anomalous data that are unavailable at training time, so the presence of anomalous samples is simulated indirectly.
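A minimal sketch of this training scheme is given below, assuming that latent feature vectors have already been extracted by the auto-encoders; the number of clusters and the linear SVM classifier are illustrative choices, not necessarily the paper's exact configuration.

```python
# Sketch of the normality-clustering / dummy-anomaly idea.
# `features` is assumed to hold latent vectors of normal training objects
# (one row per object); classifier choice and hyperparameters are illustrative.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

def train_normality_classifiers(features, k=10):
    """Cluster normal samples, then train one one-vs-rest classifier per cluster."""
    kmeans = KMeans(n_clusters=k, random_state=0).fit(features)
    labels = kmeans.labels_
    classifiers = []
    for c in range(k):
        targets = (labels == c).astype(int)   # cluster c vs. the rest
        clf = LinearSVC(C=1.0)                # the "rest" plays the role of dummy anomalies
        clf.fit(features, targets)
        classifiers.append(clf)
    return kmeans, classifiers

def abnormality_score(classifiers, x):
    """Higher score means the sample is claimed by no normality cluster."""
    scores = [clf.decision_function(x.reshape(1, -1))[0] for clf in classifiers]
    return -max(scores)
```

At test time, a sample that none of the per-cluster classifiers recognizes as its own class receives a low maximum score, and hence a high abnormality score.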
Feature learning relies on convolutional auto-encoders that extract compact latent representations from the objects detected in each frame. Three auto-encoders are trained: one captures the static appearance of a detected object, while the other two capture motion through image gradients computed across neighboring frames.
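As an illustration, a small convolutional auto-encoder of this kind could be defined as follows. This is a sketch in PyTorch; the input size, channel counts, and layer configuration are assumptions for the example, not the paper's exact architecture.

```python
# Illustrative convolutional auto-encoder for 64x64 grayscale object crops.
# The two motion CAEs would use the same structure with gradient images as input.
import torch
import torch.nn as nn

class ConvAutoEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),  # 16 -> 8
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        latent = self.encoder(x)          # compact latent representation
        return self.decoder(latent), latent

# The latent codes from the appearance CAE and the two motion CAEs are
# concatenated to form the feature vector passed to the normality classifiers.
model = ConvAutoEncoder()
reconstruction, latent = model(torch.randn(8, 1, 64, 64))
```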
Experimental Results
Empirical validation was conducted on four well-known datasets (Avenue, ShanghaiTech, UCSD Ped2, and UMN), demonstrating the methodology's robustness and scalability. The results indicate notable improvements in frame-level area under the curve (AUC) scores across all datasets. In particular, on the large-scale ShanghaiTech dataset, the approach achieves an 8.4% absolute gain in AUC over the prior leading method. Frame-level AUC scores consistently surpass those of competing techniques, supporting the model's effectiveness across varied scenes and environmental conditions.
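For context, the frame-level AUC compares a per-frame abnormality score (for example, the maximum object-level score in the frame) against binary frame-level ground truth. The short example below shows how such a score is evaluated; the scores and labels are invented purely for illustration.

```python
# Frame-level AUC from per-frame abnormality scores and ground-truth labels.
import numpy as np
from sklearn.metrics import roc_auc_score

frame_scores = np.array([0.10, 0.20, 0.90, 0.80, 0.15])  # per-frame abnormality scores
frame_labels = np.array([0, 0, 1, 1, 0])                  # 1 = abnormal frame
print("frame-level AUC:", roc_auc_score(frame_labels, frame_scores))
```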
Implications and Future Developments
Pragmatically, the method offers significant potential for applications in surveillance and other security-sensitive domains, where accurate, real-time abnormality detection is crucial. The theoretical contribution sheds light on the efficacy of dummy anomalies and the adaptability of CAEs for feature extraction in settings where anomalies are rare or poorly defined beforehand.
Looking forward, the paper discusses enhancing the framework with more advanced object segmentation and tracking, which would allow abnormal events to be analyzed at a finer granularity. Furthermore, integrating stronger forms of motion analysis and more adaptive clustering techniques might improve robustness, potentially leading to multi-class categorization frameworks that automatically adjust to dynamic environments.
The paper's approach sets a promising direction for future research, stimulating further exploration into classification-based abnormality detection models that can dynamically learn and adapt in complex real-world scenarios, extending beyond the constraints of traditional anomaly detection paradigms.