Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
91 tokens/sec
GPT-4o
12 tokens/sec
Gemini 2.5 Pro Pro
o3 Pro
5 tokens/sec
GPT-4.1 Pro
15 tokens/sec
DeepSeek R1 via Azure Pro
33 tokens/sec
Gemini 2.5 Flash Deprecated
12 tokens/sec
2000 character limit reached

Memory Aggregation Networks for Efficient Interactive Video Object Segmentation (2003.13246v1)

Published 30 Mar 2020 in cs.CV

Abstract: Interactive video object segmentation (iVOS) aims at efficiently harvesting high-quality segmentation masks of the target object in a video with user interactions. Most previous state-of-the-arts tackle the iVOS with two independent networks for conducting user interaction and temporal propagation, respectively, leading to inefficiencies during the inference stage. In this work, we propose a unified framework, named Memory Aggregation Networks (MA-Net), to address the challenging iVOS in a more efficient way. Our MA-Net integrates the interaction and the propagation operations into a single network, which significantly promotes the efficiency of iVOS in the scheme of multi-round interactions. More importantly, we propose a simple yet effective memory aggregation mechanism to record the informative knowledge from the previous interaction rounds, improving the robustness in discovering challenging objects of interest greatly. We conduct extensive experiments on the validation set of DAVIS Challenge 2018 benchmark. In particular, our MA-Net achieves the J@60 score of 76.1% without any bells and whistles, outperforming the state-of-the-arts with more than 2.7%.

Citations (67)

Summary

We haven't generated a summary for this paper yet.