Per-Clip Video Object Segmentation (2208.01924v1)

Published 3 Aug 2022 in cs.CV

Abstract: Recently, memory-based approaches have shown promising results on semi-supervised video object segmentation. These methods predict object masks frame by frame with the help of a frequently updated memory of previous masks. In contrast to this per-frame inference, we investigate an alternative perspective by treating video object segmentation as clip-wise mask propagation. In this per-clip inference scheme, we update the memory at an interval and simultaneously process the set of consecutive frames (i.e., a clip) between memory updates. The scheme provides two potential benefits: an accuracy gain from clip-level optimization and an efficiency gain from parallel computation over multiple frames. To this end, we propose a new method tailored to per-clip inference. Specifically, we first introduce a clip-wise operation that refines the features based on intra-clip correlation. In addition, we employ a progressive matching mechanism for efficient information passing within a clip. With the synergy of the two modules and a newly proposed per-clip training scheme, our network achieves state-of-the-art performance on the YouTube-VOS 2018/2019 val sets (84.6% and 84.6%) and the DAVIS 2016/2017 val sets (91.9% and 86.1%). Furthermore, our model offers a strong speed-accuracy trade-off across varying memory update intervals, which makes it highly flexible.
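The per-clip inference scheme described in the abstract can be illustrated with a minimal control-flow sketch. This is not the authors' implementation: `per_clip_inference`, `toy_segment_clip`, and the choice to write the last frame of each clip into memory are all assumptions made for illustration, and the real network (intra-clip refinement, progressive matching) is replaced by a placeholder callable.

```python
from typing import Any, Callable, List, Sequence, Tuple

Memory = List[Tuple[Any, Any]]  # (frame, mask) pairs; hypothetical layout


def per_clip_inference(
    frames: Sequence[Any],
    first_mask: Any,
    segment_clip: Callable[[Sequence[Any], Memory], List[Any]],
    clip_len: int,
) -> Tuple[List[Any], Memory]:
    """Propagate a first-frame mask clip by clip (illustrative sketch only)."""
    if not frames:
        raise ValueError("frames must contain at least one frame")
    if clip_len < 1:
        raise ValueError("clip_len must be >= 1")

    # Semi-supervised setting: the mask of the first frame is given.
    memory: Memory = [(frames[0], first_mask)]
    masks: List[Any] = [first_mask]

    # Step through the remaining frames one clip at a time.
    for start in range(1, len(frames), clip_len):
        clip = frames[start:start + clip_len]
        # All frames of the clip are handled in a single call, standing in
        # for the parallel, clip-level processing described in the abstract.
        clip_masks = segment_clip(clip, memory)
        masks.extend(clip_masks)
        # Assumption: memory is updated once per clip with its last frame.
        memory.append((clip[-1], clip_masks[-1]))

    return masks, memory


def toy_segment_clip(clip: Sequence[Any], memory: Memory) -> List[str]:
    """Placeholder segmenter: records which memory size each frame saw."""
    return [f"mask{frame}@mem{len(memory)}" for frame in clip]
```

With `clip_len=1` the loop reduces to per-frame inference with a memory update after every frame; larger values of `clip_len` give fewer memory updates and larger batches per call, which is the speed-accuracy knob the abstract refers to.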

Authors (5)

Kwanyong Park (15 papers)
Sanghyun Woo (31 papers)
Seoung Wug Oh (33 papers)
In So Kweon (156 papers)
Joon-Young Lee (61 papers)

Citations (47)
