Crossover Learning for Fast Online Video Instance Segmentation (2104.05970v1)

Published 13 Apr 2021 in cs.CV

Abstract: Modeling temporal visual context across frames is critical for video instance segmentation (VIS) and other video understanding tasks. In this paper, we propose a fast online VIS model named CrossVIS. For temporal information modeling in VIS, we present a novel crossover learning scheme that uses the instance feature in the current frame to localize the same instance pixel-wise in other frames. Unlike previous schemes, crossover learning does not require any additional network parameters for feature enhancement. By integrating with the instance segmentation loss, crossover learning enables efficient cross-frame instance-to-pixel relation learning and brings cost-free improvement during inference. In addition, a global balanced instance embedding branch is proposed for more accurate and more stable online instance association. We conduct extensive experiments on three challenging VIS benchmarks, i.e., YouTube-VIS-2019, OVIS, and YouTube-VIS-2021, to evaluate our method. To our knowledge, CrossVIS achieves state-of-the-art performance among all online VIS methods and shows a decent trade-off between latency and accuracy. Code will be available to facilitate future research.
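
The crossover idea in the abstract, using an instance feature from one frame to localize the same instance pixel-wise in another frame, can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the abstract does not specify how the instance feature is applied or which loss is used, so the dot-product correlation in `localize`, the Dice loss, and every function and parameter name below are assumptions chosen only for illustration.

```python
import numpy as np


def localize(instance_feature, frame_features):
    """Score every pixel of a frame against one instance feature.

    ASSUMPTION: the instance feature is a vector of shape (C,) that is
    correlated with a per-pixel feature map of shape (C, H, W) by a dot
    product, followed by a sigmoid. The abstract does not state this.
    """
    logits = np.einsum("c,chw->hw", instance_feature, frame_features)
    return 1.0 / (1.0 + np.exp(-logits))


def dice_loss(pred, target, eps=1e-6):
    """Dice loss between a soft mask and a binary mask (assumed loss)."""
    intersection = (pred * target).sum()
    denominator = (pred ** 2).sum() + (target ** 2).sum()
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def crossover_loss(inst_t, feats_t, mask_t, inst_k, feats_k, mask_k):
    """Sum of same-frame and cross-frame mask losses for one instance.

    inst_t, inst_k   : features of the same instance in frames t and k
    feats_t, feats_k : per-pixel feature maps of frames t and k
    mask_t, mask_k   : ground-truth masks of the instance in each frame
    """
    # Same-frame terms: each instance feature segments its own frame.
    same = (dice_loss(localize(inst_t, feats_t), mask_t)
            + dice_loss(localize(inst_k, feats_k), mask_k))
    # Crossover terms: each instance feature segments the other frame.
    cross = (dice_loss(localize(inst_t, feats_k), mask_k)
             + dice_loss(localize(inst_k, feats_t), mask_t))
    return same + cross
```

In this sketch the crossover terms reuse `localize` and add no parameters of their own, which mirrors the abstract's statement that the scheme needs no additional network parameters. Because the extra terms appear only in the training loss, nothing changes at inference time.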

Authors (8)
  1. Shusheng Yang (16 papers)
  2. Yuxin Fang (15 papers)
  3. Xinggang Wang (163 papers)
  4. Yu Li (378 papers)
  5. Chen Fang (157 papers)
  6. Ying Shan (252 papers)
  7. Bin Feng (44 papers)
  8. Wenyu Liu (146 papers)
Citations (95)
