
Zero-Shot Monocular Motion Segmentation in the Wild by Combining Deep Learning with Geometric Motion Model Fusion (2405.01723v1)

Published 2 May 2024 in cs.CV and cs.AI

Abstract: Detecting and segmenting moving objects from a moving monocular camera is challenging in the presence of unknown camera motion, diverse object motions, and complex scene structures. Most existing methods rely on a single motion cue to perform motion segmentation, which is usually insufficient across different complex environments. While a few recent deep-learning-based methods are able to combine multiple motion cues to achieve improved accuracy, they depend heavily on vast datasets and extensive annotations, making them less adaptable to new scenarios. To address these limitations, we propose a novel monocular dense segmentation method that achieves state-of-the-art motion segmentation results in a zero-shot manner. The proposed method synergistically combines the strengths of deep learning and geometric model fusion methods by performing geometric model fusion on object proposals. Experiments show that our method achieves competitive results on several motion segmentation datasets and even surpasses some state-of-the-art supervised methods on certain benchmarks, without being trained on any data. We also present an ablation study to show the effectiveness of combining different geometric models for motion segmentation, highlighting the value of our geometric model fusion strategy.
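
The abstract does not spell out how the fusion is computed, so the following is only a rough sketch of the general idea of fusing motion evidence from several geometric models over object proposals. It is not the authors' method. It assumes each geometric model has already produced a pairwise affinity matrix between proposals. The function names, the plain averaging used as the fusion step, the two-way spectral split, and all numeric values are hypothetical choices made for illustration.

```python
import numpy as np


def fuse_affinities(affinities):
    """Average several (n x n) affinity matrices (an assumed, simplistic fusion rule)."""
    stacked = np.stack([np.asarray(a, dtype=float) for a in affinities])
    fused = stacked.mean(axis=0)
    return 0.5 * (fused + fused.T)  # enforce symmetry


def spectral_bipartition(affinity):
    """Split proposals into two motion groups via the Fiedler vector
    of the symmetric normalized graph Laplacian."""
    degree = affinity.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    laplacian = np.eye(len(affinity)) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues returned in ascending order
    fiedler = eigvecs[:, 1] * d_inv_sqrt    # eigenvector of the second-smallest eigenvalue
    labels = (fiedler > 0).astype(int)
    # Fix the arbitrary eigenvector sign so that proposal 0 always gets label 0.
    return labels if labels[0] == 0 else 1 - labels


# Made-up affinities for four object proposals (0, 1: background; 2, 3: a moving object).
# Each hypothetical model is misled on a different pair of proposals.
model_a = np.array([[1.0, 0.9, 0.1, 0.1],
                    [0.9, 1.0, 0.6, 0.1],
                    [0.1, 0.6, 1.0, 0.9],
                    [0.1, 0.1, 0.9, 1.0]])
model_b = np.array([[1.0, 0.9, 0.1, 0.5],
                    [0.9, 1.0, 0.1, 0.1],
                    [0.1, 0.1, 1.0, 0.9],
                    [0.5, 0.1, 0.9, 1.0]])

fused = fuse_affinities([model_a, model_b])
labels = spectral_bipartition(fused)
print(labels)
```

Averaging is used here only because it is the simplest way to let one model's spurious affinity be diluted by another model that does not share the error. A real system would need a principled fusion rule and a way to choose the number of motion groups.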

Authors (3)
  1. Yuxiang Huang
  2. Yuhao Chen
  3. John Zelek
