
Video Anomaly Detection and Explanation via Large Language Models (2401.05702v1)

Published 11 Jan 2024 in cs.CV

Abstract: Video Anomaly Detection (VAD) aims to localize abnormal events on the timeline of long-range surveillance videos. Anomaly-scoring-based methods have prevailed for years but suffer from the high complexity of thresholding and the low explainability of detection results. In this paper, we conduct pioneering research on equipping video-based LLMs (VLLMs) in the framework of VAD, making the VAD model free from thresholds and able to explain the reasons for the detected anomalies. We introduce a novel network module, Long-Term Context (LTC), to mitigate the incapability of VLLMs in long-range context modeling. We design a three-phase training method to improve the efficiency of fine-tuning VLLMs by substantially reducing the requirements for VAD data and lowering the costs of annotating instruction-tuning data. Our trained model achieves the top performance on the anomaly videos of the UCF-Crime and TAD benchmarks, with AUC improvements of +3.86% and +4.96%, respectively. More impressively, our approach can provide textual explanations for detected anomalies.
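
To make the thresholding issue mentioned in the abstract concrete, here is a minimal sketch of how a conventional anomaly-scoring pipeline turns per-frame scores into detected segments. It is not the paper's method: the function name `segments_above`, the score values, and the two thresholds are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def segments_above(scores: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) frame ranges whose score is >= threshold."""
    segments: List[Tuple[int, int]] = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i  # a new anomalous segment begins here
        elif score < threshold and start is not None:
            segments.append((start, i - 1))  # the segment ended on the previous frame
            start = None
    if start is not None:
        segments.append((start, len(scores) - 1))  # segment runs to the last frame
    return segments


# Hypothetical per-frame anomaly scores (illustrative values only).
scores = [0.1, 0.2, 0.6, 0.9, 0.7, 0.3, 0.55, 0.1]

low = segments_above(scores, 0.5)   # a permissive threshold
high = segments_above(scores, 0.8)  # a strict threshold
print(low, high)
```

The same scores yield different segments depending on the threshold, so the localization result depends on a tuning choice. That dependence is the kind of sensitivity a threshold-free model would avoid.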

Authors (2)
  1. Hui Lv (9 papers)
  2. Qianru Sun (65 papers)
Citations (10)