Moving Object Proposals with Deep Learned Optical Flow for Video Object Segmentation (2402.08882v1)

Published 14 Feb 2024 in cs.CV and cs.LG

Abstract: Dynamic scene understanding is one of the most prominent areas of interest in the computer vision community. Pixel-wise segmentation with neural networks is a widely accepted way to improve it, and recent research on pixel-wise segmentation has combined semantic and motion information to achieve good performance. In this work, we propose a state-of-the-art neural network architecture that produces moving object proposals (MOP) accurately and efficiently. We first train an unsupervised convolutional neural network (UnFlow) to estimate optical flow. We then feed the output of the optical flow network to a fully convolutional SegNet model. The main contributions of our work are (1) fine-tuning the pretrained optical flow model on the recently released DAVIS dataset and (2) leveraging fully convolutional neural networks with an encoder-decoder architecture to segment objects. We implemented the code in TensorFlow and ran the training and evaluation processes on an AWS EC2 instance.
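The two-stage data flow described in the abstract (optical flow first, segmentation second) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `flow_to_features` and `segment_moving_pixels` are hypothetical, the flow field is hand-written rather than produced by UnFlow, and a plain magnitude threshold stands in for the SegNet model. The sketch only shows the shape of the data passed between the stages.

```python
import math

def flow_to_features(flow):
    """Turn an H x W grid of (u, v) flow vectors into (u, v, magnitude) triples.

    Hypothetical helper: the paper does not specify how flow is encoded
    before it reaches the segmentation network.
    """
    return [[(u, v, math.hypot(u, v)) for (u, v) in row] for row in flow]

def segment_moving_pixels(features, threshold=1.0):
    """Stand-in for the segmentation network (NOT SegNet).

    Marks a pixel as moving (1) when its flow magnitude exceeds the threshold.
    """
    return [[1 if mag > threshold else 0 for (_, _, mag) in row] for row in features]

# Toy 3 x 4 flow field: static background with a 2 x 2 patch moving right and down.
flow = [
    [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0), (0.0, 0.0)],
    [(0.0, 0.0), (3.0, 4.0), (3.0, 4.0), (0.0, 0.0)],
    [(0.0, 0.0), (3.0, 4.0), (3.0, 4.0), (0.0, 0.0)],
]

features = flow_to_features(flow)       # stage 1 output, reshaped for stage 2
mask = segment_moving_pixels(features)  # stage 2: per-pixel moving-object mask

for row in mask:
    print(row)
```

In the pipeline the abstract describes, a learned encoder-decoder would replace the threshold, so the mask could follow object boundaries rather than raw motion magnitude.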
