A Video is Worth 10,000 Words: Training and Benchmarking with Diverse Captions for Better Long Video Retrieval (2312.00115v2)

Published 30 Nov 2023 in cs.CV and cs.CL

Abstract: Existing long video retrieval systems are trained and tested in the paragraph-to-video retrieval regime, where every long video is described by a single long paragraph. This neglects the richness and variety of possible valid descriptions of a video, which could range anywhere from moment-by-moment detail to a single phrase summary. To provide a more thorough evaluation of the capabilities of long video retrieval systems, we propose a pipeline that leverages state-of-the-art LLMs to carefully generate a diverse set of synthetic captions for long videos. We validate this pipeline's fidelity via rigorous human inspection. We use synthetic captions from this pipeline to perform a benchmark of a representative set of video LLMs using long video datasets, and show that the models struggle on shorter captions. We show that finetuning on this data can both mitigate these issues (+2.8% R@1 over SOTA on ActivityNet with diverse captions), and even improve performance on standard paragraph-to-video retrieval (+1.0% R@1 on ActivityNet). We also use synthetic data from our pipeline as query expansion in the zero-shot setting (+3.4% R@1 on ActivityNet). We derive insights by analyzing failure cases for retrieval with short captions. For data access and other details, please refer to our project website at https://mgwillia.github.io/10k-words.
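
The R@1 figures quoted in the abstract are Recall@1 scores for text-to-video retrieval. As a minimal sketch of how such a metric is typically computed (not the authors' evaluation code), the hypothetical helper `recall_at_k` below takes a caption-by-video similarity matrix and the index of the correct video for each caption. The toy scores and the assumption that several captions can point to the same video are illustrative only.

```python
def recall_at_k(similarity, gt_video_index, k=1):
    """Fraction of caption queries whose correct video ranks in the top k.

    similarity: list of rows, one per caption, each holding one score per video.
    gt_video_index: for each caption, the column index of its correct video.
    """
    hits = 0
    for scores, gt in zip(similarity, gt_video_index):
        # Rank video indices for this caption, highest score first.
        ranked = sorted(range(len(scores)), key=lambda v: scores[v], reverse=True)
        if gt in ranked[:k]:
            hits += 1
    return hits / len(gt_video_index)


# Toy example (made-up scores): 4 captions scored against 3 videos.
sim = [
    [0.9, 0.1, 0.2],  # long paragraph describing video 0
    [0.3, 0.4, 0.8],  # short summary of video 0, ranked incorrectly
    [0.2, 0.7, 0.1],  # caption for video 1
    [0.1, 0.3, 0.6],  # caption for video 2
]
gt = [0, 0, 1, 2]

r_at_1 = recall_at_k(sim, gt, k=1)
r_at_3 = recall_at_k(sim, gt, k=3)
```

In this toy setup the short summary is the only query that misses at k=1, which mirrors the kind of short-caption failure the abstract describes.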

Authors (6)
  1. Matthew Gwilliam (13 papers)
  2. Michael Cogswell (19 papers)
  3. Meng Ye (47 papers)
  4. Karan Sikka (32 papers)
  5. Abhinav Shrivastava (120 papers)
  6. Ajay Divakaran (43 papers)