Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
102 tokens/sec
GPT-4o
59 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation (2308.14710v1)

Published 28 Aug 2023 in cs.CV, cs.AI, and cs.LG

Abstract: Existing approaches to unsupervised video instance segmentation typically rely on motion estimates and experience difficulties tracking small or divergent motions. We present VideoCutLER, a simple method for unsupervised multi-instance video segmentation without using motion-based learning signals like optical flow or training on natural videos. Our key insight is that using high-quality pseudo masks and a simple video synthesis method for model training is surprisingly sufficient to enable the resulting video model to effectively segment and track multiple instances across video frames. We show the first competitive unsupervised learning results on the challenging YouTubeVIS-2019 benchmark, achieving 50.7% APvideo50 , surpassing the previous state-of-the-art by a large margin. VideoCutLER can also serve as a strong pretrained model for supervised video instance segmentation tasks, exceeding DINO by 15.9% on YouTubeVIS-2019 in terms of APvideo.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (5)
  1. Xudong Wang (113 papers)
  2. Ishan Misra (65 papers)
  3. Ziyun Zeng (16 papers)
  4. Rohit Girdhar (43 papers)
  5. Trevor Darrell (324 papers)
Citations (9)