Graph Convolutional Label Noise Cleaner: Train a Plug-and-play Action Classifier for Anomaly Detection

Published 18 Mar 2019 in cs.CV | (1903.07256v1)

Abstract: Video anomaly detection under weak labels is formulated as a typical multiple-instance learning problem in previous works. In this paper, we provide a new perspective, i.e., a supervised learning task under noisy labels. In such a viewpoint, as long as cleaning away label noise, we can directly apply fully supervised action classifiers to weakly supervised anomaly detection, and take maximum advantage of these well-developed classifiers. For this purpose, we devise a graph convolutional network to correct noisy labels. Based upon feature similarity and temporal consistency, our network propagates supervisory signals from high-confidence snippets to low-confidence ones. In this manner, the network is capable of providing cleaned supervision for action classifiers. During the test phase, we only need to obtain snippet-wise predictions from the action classifier without any extra post-processing. Extensive experiments on 3 datasets at different scales with 2 types of action classifiers demonstrate the efficacy of our method. Remarkably, we obtain the frame-level AUC score of 82.12% on UCF-Crime.

Citations (365)

Summary

  • The paper introduces a GCN-based label noise cleaner that recasts weakly supervised anomaly detection as supervised learning under noisy labels, so that fully supervised action classifiers can be trained on the cleaned labels.
  • The method leverages feature similarity and temporal consistency through an EM-like alternate training of a GCN and action classifier.
  • Experiments on UCF-Crime, ShanghaiTech, and UCSD-Peds report improved AUC scores on datasets of three different scales.

Overview of "Graph Convolutional Label Noise Cleaner: Train a Plug-and-play Action Classifier for Anomaly Detection"

In this paper, Zhong et al. propose an approach to video anomaly detection under weak supervision that recasts what has traditionally been treated as a multiple-instance learning (MIL) task as a supervised learning problem with noisy labels. Their main contribution is a Graph Convolutional Network (GCN) that cleans the label noise, which allows fully supervised action classifiers to be trained and applied directly to weakly supervised anomaly detection.
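
To make the "noisy labels" framing concrete, here is a minimal sketch of how a video-level label could turn into noisy snippet-level labels. It assumes that every snippet simply inherits its video's label; the function name and the toy ground truth are hypothetical and not taken from the paper.

```python
def snippet_labels_from_video_label(video_label, num_snippets):
    """Copy a video-level label (1 = anomalous, 0 = normal) to every snippet.

    Assumption for illustration: snippets inherit the weak video-level label.
    """
    return [video_label] * num_snippets

# Hypothetical ground truth for one anomalous video (unknown during training):
# only snippets 2 and 3 actually contain the anomaly.
true_snippets = [0, 0, 1, 1, 0, 0]

noisy = snippet_labels_from_video_label(1, len(true_snippets))

# Fraction of snippets whose inherited label disagrees with the ground truth.
noise_rate = sum(n != t for n, t in zip(noisy, true_snippets)) / len(noisy)
print(noisy, round(noise_rate, 3))
```

Under this assumption, the normal snippets inside an anomalous video are the mislabeled ones, and they are what a label noise cleaner would need to correct.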

Methodology

The core of the paper is the "Graph Convolutional Label Noise Cleaner," which uses feature similarity and temporal consistency between video snippets to propagate supervisory signals from high-confidence snippets to low-confidence ones. The authors introduce an EM-like optimization strategy that alternates between training the label noise cleaner and re-training the action classifier with the cleaned labels. The method consists of the following steps:

  1. Label Noise Cleaner Using GCN: The GCN aims to correct noisy labels by modeling both feature similarity and temporal consistency. Video snippets are represented as nodes in the graph, while edges encode the relationships based on feature and temporal proximity.
  2. Feature Similarity Graph Module: This module constructs an attributed graph where nodes represent video snippets and edges represent similarity in features.
  3. Temporal Consistency Graph Module: This module considers the temporal order of snippets, assuming that anomalies are likely to appear in close temporal proximity.
  4. Alternate Optimization: The GCN and the action classifier (e.g., C3D or TSN) are trained iteratively. Initially, the classifier is trained with noisy labels, after which the GCN cleans these labels. The cleaned labels are then used to re-train the classifier, and the process is repeated.
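
The two graph modules in steps 2 and 3 can be sketched as adjacency matrices over snippets, followed by one propagation step. This is an illustrative simplification, not the authors' implementation: the cosine-similarity edge weights, the `exp(-|i - j|)` temporal kernel, and the averaging of the two branches are assumptions, and the propagation omits the learned weight matrices and nonlinearities that an actual GCN layer would have. All function names are hypothetical.

```python
import math

def feature_similarity_adjacency(features):
    """Edge weight = cosine similarity of two snippet features, clipped at 0."""
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm if norm > 0 else 0.0
    return [[max(0.0, cosine(u, v)) for v in features] for u in features]

def temporal_consistency_adjacency(num_snippets):
    """Edge weight decays with temporal distance (assumed kernel: exp(-|i - j|))."""
    return [[math.exp(-abs(i - j)) for j in range(num_snippets)]
            for i in range(num_snippets)]

def propagate(adjacency, scores):
    """Parameter-free smoothing: each node takes the weighted mean of its neighbors."""
    smoothed = []
    for i, row in enumerate(adjacency):
        total = sum(row)
        if total > 0:
            smoothed.append(sum(w * s for w, s in zip(row, scores)) / total)
        else:
            smoothed.append(scores[i])  # isolated node keeps its own score
    return smoothed

# Toy example: snippets 0 and 1 look alike, as do snippets 2 and 3.
features = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
# Classifier scores: snippets 0 and 3 are confident, 1 and 2 are uncertain.
scores = [0.9, 0.5, 0.5, 0.1]

by_feature = propagate(feature_similarity_adjacency(features), scores)
by_time = propagate(temporal_consistency_adjacency(len(scores)), scores)
cleaned = [(f + t) / 2 for f, t in zip(by_feature, by_time)]
print([round(c, 3) for c in cleaned])
```

In the feature branch of this toy example, the uncertain snippet 1 is pulled toward its confident look-alike snippet 0, and snippet 2 is pulled toward snippet 3. In the alternate optimization of step 4, scores cleaned this way would serve as the supervision for the next round of classifier training.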

Experimental Results

The proposed method is evaluated on three datasets of varying scales: UCF-Crime, ShanghaiTech, and UCSD-Peds. The reported results show improved anomaly detection performance on each, supporting both the alternate training framework and the GCN-based noise cleaning approach.

  • UCF-Crime: The model achieves a frame-level AUC score of 82.12% with TSN RGB, which the authors report as outperforming existing methods. This result shows that the method can handle large-scale, real-world video data.
  • ShanghaiTech: Experimental results show improvements across all action classifiers, with the highest AUC reaching 84.44% with TSN RGB. This medium-scale dataset supports the generalizability of the proposed approach.
  • UCSD-Peds: On this small-scale dataset, the method achieves an average AUC of 93.2% with TSN gray-scale, demonstrating robustness even with limited training data.
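
The frame-level AUC reported above can be illustrated with a small sketch. It assumes that each snippet-wise prediction is copied to all frames of that snippet before scoring; the helper names, the `frames_per_snippet` parameter, and the toy data are hypothetical, and the paper's exact evaluation protocol may differ.

```python
def expand_to_frames(snippet_scores, frames_per_snippet):
    """Repeat each snippet score for every frame in that snippet (assumed mapping)."""
    return [s for s in snippet_scores for _ in range(frames_per_snippet)]

def roc_auc(labels, scores):
    """AUC as the probability that a positive frame outscores a negative one.

    Ties count as half a win (the Mann-Whitney U formulation of ROC AUC).
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative frame")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy video: 3 snippets of 2 frames each, with frame-level ground truth.
frame_scores = expand_to_frames([0.1, 0.8, 0.4], frames_per_snippet=2)
frame_labels = [0, 0, 1, 1, 1, 0]

auc = roc_auc(frame_labels, frame_scores)
print(round(auc, 4))
```

The pairwise formulation is quadratic in the number of frames, so it suits small illustrations only; a sort-based implementation would be preferable for full test sets.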

Implications and Future Work

By recasting weakly supervised anomaly detection as a supervised learning task with noisy labels, the approach can draw on well-developed fully supervised action classifiers, and at test time it needs only snippet-wise predictions from the classifier without extra post-processing. The use of a GCN for label noise cleaning also offers a way to improve label quality, which is critical for the performance of supervised models.

Future developments could explore the following directions:

  • Scalability: Extending the proposed method to handle even larger datasets and more complex scenarios.
  • Real-Time Applications: Adapting the approach for real-time anomaly detection in video streams, addressing computational efficiency.
  • Incorporation of Additional Contextual Information: Leveraging more context from the videos (e.g., scene understanding) to improve the robustness and accuracy of anomaly detection.

Conclusion

This paper introduces a framework for weakly supervised anomaly detection that recasts it as a supervised task under noisy labels. A GCN cleans the label noise, and an alternate optimization scheme re-trains the action classifier on the cleaned labels; the reported results across three datasets and two types of action classifiers support the approach. The work opens avenues for further research and practical applications in intelligent surveillance and related fields.
