Large Language Models for Crash Detection in Video: A Survey of Methods, Datasets, and Challenges (2507.02074v1)
Abstract: Crash detection from video feeds is a critical problem in intelligent transportation systems. Recent developments in large language models (LLMs) and vision-language models (VLMs) have transformed how we process, reason about, and summarize multimodal information. This paper surveys recent methods leveraging LLMs for crash detection from video data. We present a structured taxonomy of fusion strategies, summarize key datasets, analyze model architectures, compare performance benchmarks, and discuss ongoing challenges and opportunities. Our review provides a foundation for future research in this fast-growing intersection of video understanding and foundation models.