
Space-time Reinforcement Network for Video Object Segmentation (2405.04042v1)

Published 7 May 2024 in cs.CV and cs.AI

Abstract: Recent video object segmentation (VOS) networks typically use memory-based methods: for each query frame, the mask is predicted by space-time matching against memory frames. Although these methods perform well, they suffer from two issues: 1) challenging data can destroy the space-time coherence between adjacent video frames; 2) pixel-level matching leads to mismatches caused by noise or distractors. To address these issues, we first propose generating an auxiliary frame between adjacent frames, which serves as an implicit short-term temporal reference for the query frame. Next, we learn a prototype for each video object, so that matching between the query and memory can be performed at the prototype level. Experiments demonstrate that our network outperforms the state-of-the-art method on DAVIS 2017, achieving a J&F score of 86.4%, and attains a competitive 85.0% on YouTube-VOS 2018. In addition, our network runs at a high inference speed of 32+ FPS.
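
To make the idea of prototype-level matching concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how prototypes are learned, so this sketch assumes the simplest common stand-in (masked average pooling of memory features, followed by cosine similarity against query features). The function names `masked_average_prototype`, `prototype_similarity`, and `assign_labels` are hypothetical and chosen for illustration only.

```python
import numpy as np


def masked_average_prototype(features, mask):
    """Average the feature vectors of the pixels selected by a binary mask.

    features: array of shape (C, H, W) from a memory frame.
    mask: array of shape (H, W) with 1 on the object and 0 elsewhere.
    Returns a (C,) vector. Assumption: a simple stand-in for a learned prototype.
    """
    weights = mask.astype(features.dtype)
    total = weights.sum()
    if total == 0:
        # Empty mask: return a zero vector rather than dividing by zero.
        return np.zeros(features.shape[0], dtype=features.dtype)
    return (features * weights[None]).sum(axis=(1, 2)) / total


def prototype_similarity(query_features, prototypes):
    """Cosine similarity between every query pixel and every prototype.

    query_features: (C, H, W); prototypes: (K, C). Returns (K, H, W).
    """
    c, h, w = query_features.shape
    q = query_features.reshape(c, h * w)
    q = q / (np.linalg.norm(q, axis=0, keepdims=True) + 1e-8)
    p = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + 1e-8)
    return (p @ q).reshape(-1, h, w)


def assign_labels(query_features, prototypes):
    """Label each query pixel with the index of its most similar prototype."""
    return prototype_similarity(query_features, prototypes).argmax(axis=0)
```

Compared with pixel-level matching, each query pixel is compared against K object-level vectors instead of every memory pixel, which is one plausible reason a prototype-based scheme could be less sensitive to isolated noisy or distracting pixels.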

Authors (4)
  1. Yadang Chen
  2. Wentao Zhu
  3. Zhi-Xin Yang
  4. Enhua Wu