
Real-world Anomaly Detection in Surveillance Videos

Published 12 Jan 2018 in cs.CV | (1801.04264v3)

Abstract: Surveillance videos are able to capture a variety of realistic anomalies. In this paper, we propose to learn anomalies by exploiting both normal and anomalous videos. To avoid annotating the anomalous segments or clips in training videos, which is very time consuming, we propose to learn anomaly through the deep multiple instance ranking framework by leveraging weakly labeled training videos, i.e. the training labels (anomalous or normal) are at video-level instead of clip-level. In our approach, we consider normal and anomalous videos as bags and video segments as instances in multiple instance learning (MIL), and automatically learn a deep anomaly ranking model that predicts high anomaly scores for anomalous video segments. Furthermore, we introduce sparsity and temporal smoothness constraints in the ranking loss function to better localize anomaly during training. We also introduce a new large-scale first of its kind dataset of 128 hours of videos. It consists of 1900 long and untrimmed real-world surveillance videos, with 13 realistic anomalies such as fighting, road accident, burglary, robbery, etc. as well as normal activities. This dataset can be used for two tasks. First, general anomaly detection considering all anomalies in one group and all normal activities in another group. Second, for recognizing each of 13 anomalous activities. Our experimental results show that our MIL method for anomaly detection achieves significant improvement on anomaly detection performance as compared to the state-of-the-art approaches. We provide the results of several recent deep learning baselines on anomalous activity recognition. The low recognition performance of these baselines reveals that our dataset is very challenging and opens more opportunities for future work. The dataset is available at: https://webpages.uncc.edu/cchen62/dataset.html

Citations (1,336)

Summary

  • The paper introduces a deep MIL framework that detects anomalies in surveillance videos by leveraging weakly labeled video-level annotations.
  • The proposed method employs a novel ranking loss with sparsity and temporal smoothness constraints to differentiate anomalous segments.
  • The study validates the approach on a large-scale dataset of 1900 videos, outperforming state-of-the-art methods in ROC and AUC metrics.

Real-world Anomaly Detection in Surveillance Videos

This essay summarizes the paper "Real-world Anomaly Detection in Surveillance Videos" (1801.04264), which introduces an approach for detecting anomalies in surveillance video footage using a deep multiple instance learning (MIL) ranking framework.

Introduction

The paper addresses the growing need for automated anomaly detection in surveillance video: cameras are increasingly deployed in public spaces, while human monitoring capacity is limited. It presents a MIL-based anomaly detection framework trained on weakly labeled data, so the model learns to score anomalies without segment-level annotations, which are labor-intensive to obtain. The authors also introduce a large-scale dataset to support evaluation and future research on anomaly detection.

Problem Formulation and Methodology

Multiple Instance Learning Framework

The research applies MIL to anomaly detection: each surveillance video is treated as a bag and divided into temporal segments, which serve as the instances. Only the bag carries a label (anomalous or normal), given at the video level. This sidesteps the need for precise segment-level annotations and allows the anomaly detection model to be trained on weakly labeled data.

Figure 1: The flow diagram of the proposed anomaly detection approach. Segments of surveillance videos are treated as instances in a bag-level MIL framework, powered by a deep learning network.
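The bag-and-instance setup described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the names `Bag`, `split_into_segments`, and `make_bag` are hypothetical, the number of segments per video is a free parameter, and plain integers stand in for per-frame features.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Bag:
    """One video in the MIL sense: many instances, one video-level label."""
    instances: List[list]  # temporal segments; each segment is a list of frames
    label: int             # 1 = anomalous video, 0 = normal video


def split_into_segments(frames: Sequence, num_segments: int) -> List[list]:
    """Split a frame sequence into contiguous, non-overlapping segments."""
    n = len(frames)
    if num_segments <= 0 or n < num_segments:
        raise ValueError("need at least one frame per segment")
    # Integer boundaries keep segment lengths within one frame of each other.
    bounds = [(i * n) // num_segments for i in range(num_segments + 1)]
    return [list(frames[bounds[i]:bounds[i + 1]]) for i in range(num_segments)]


def make_bag(frames: Sequence, video_label: int, num_segments: int) -> Bag:
    """Attach the video-level label to the bag; segments stay unlabeled."""
    return Bag(split_into_segments(frames, num_segments), video_label)


# Hypothetical 10-frame video labeled anomalous at the video level only.
bag = make_bag(list(range(10)), video_label=1, num_segments=4)
```

Note that no segment carries its own label: an anomalous bag only asserts that at least one of its instances is anomalous, which is exactly the weak supervision the framework relies on.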

Deep MIL Ranking Model

The core methodology poses anomaly detection as a regression problem within a deep MIL ranking framework. The ranking model is trained to predict higher anomaly scores for anomalous segments than for normal segments, using a ranking loss function that includes sparsity and temporal smoothness constraints. The sparsity constraint reflects the real-world condition that anomalies occur only sporadically within a video, and the temporal smoothness constraint reflects that transitions between events are gradual, so scores should not jump between neighboring segments. Because segment-level labels are unavailable, the ranking loss compares only the highest-scoring instance in each positive (anomalous) bag with the highest-scoring instance in each negative (normal) bag.

Figure 3: Evolution of score on a training video over iterations. As iterations increase, the method effectively differentiates between anomalous and normal video segments.
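One way to write such a loss is sketched below in plain Python. Several details are assumptions of this sketch rather than statements from the summary: the hinge form with a margin of 1, applying both constraints to the anomalous bag's scores only, and the weights `lambda_smooth` and `lambda_sparse` with their default values. The function name `mil_ranking_loss` is hypothetical.

```python
from typing import Sequence


def mil_ranking_loss(
    anomalous_scores: Sequence[float],
    normal_scores: Sequence[float],
    lambda_smooth: float = 8e-5,   # assumed weight, illustrative only
    lambda_sparse: float = 8e-5,   # assumed weight, illustrative only
) -> float:
    """Ranking loss over one anomalous bag and one normal bag.

    Each argument holds per-segment anomaly scores in temporal order.
    """
    # Ranking term: the top-scoring segment of the anomalous bag should
    # outrank the top-scoring segment of the normal bag (assumed margin 1).
    hinge = max(0.0, 1.0 - max(anomalous_scores) + max(normal_scores))

    # Temporal smoothness: penalize score jumps between adjacent segments.
    smoothness = sum(
        (a - b) ** 2 for a, b in zip(anomalous_scores, anomalous_scores[1:])
    )

    # Sparsity: anomalies are rare, so the total score mass should be small.
    sparsity = sum(anomalous_scores)

    return hinge + lambda_smooth * smoothness + lambda_sparse * sparsity


# Hypothetical scores: one clear peak in the anomalous video.
loss = mil_ranking_loss([0.1, 0.9, 0.1], [0.1, 0.2])
```

Using only the maximum of each bag means the gradient of the ranking term flows through a single segment per video, which is how the model can localize anomalies without ever seeing segment-level labels.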

Dataset and Experimental Validation

The authors introduce a dataset of 1900 long, untrimmed surveillance videos covering 13 real-world anomalies, such as fighting, road accidents, burglary, and robbery, alongside normal activities. With 128 hours of video, the authors describe it as a large-scale dataset that is the first of its kind.

Comparison with State-of-the-art

The proposed method outperforms existing approaches, including sparse-coding-based methods and deep autoencoders, on anomaly detection. The evaluation compares ROC curves and the corresponding AUC values on the authors' dataset, where the proposed MIL framework scores highest among the compared methods.

Figure 5: ROC comparison of binary classifier (blue), Lu et al.'s method (cyan), Hasan et al.'s autoencoder (black), and the proposed method without (magenta) and with (red) constraints.
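For readers unfamiliar with the metric, the following sketch shows how an ROC curve and its AUC can be computed from anomaly scores and binary ground-truth labels. It uses only the standard library; the function names `roc_points` and `auc` are hypothetical, and the labels and scores are made-up numbers, not results from the paper.

```python
from typing import List, Sequence, Tuple


def roc_points(labels: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) points of the ROC curve.

    labels: 1 for anomalous, 0 for normal. Requires both classes to be present.
    """
    num_pos = sum(labels)
    num_neg = len(labels) - num_pos
    if num_pos == 0 or num_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    # Sweep the decision threshold from the highest score downward.
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume all items tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / num_neg, tp / num_pos))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Made-up labels and scores for four items.
example_auc = auc(roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # → 0.75
```

An AUC of 1.0 means every anomalous item outscores every normal one, while 0.5 corresponds to chance-level ranking.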

Qualitative Analysis

Qualitative results on testing videos show the model detecting and localizing anomalies, and also expose failure cases, which often arise from poor visibility or from nuanced normal behaviors being misinterpreted as anomalies.

Figure 7: Qualitative results of the proposed method on testing videos, showcasing successful anomaly detection and highlighting some instances of false positives.

Conclusions

The paper contributes a method for anomaly detection in surveillance videos that uses a deep MIL ranking framework trained on weakly labeled data. It also introduces a large-scale dataset for evaluating video anomaly detection methods. Beyond outperforming the state-of-the-art methods it is compared against, the work demonstrates the potential of weakly supervised learning for real-world applications in complex environments. Future work may focus on addressing the identified failure cases and on using the introduced dataset to improve recognition of individual anomalous activities.
