
Sequential Amodal Segmentation via Cumulative Occlusion Learning (2405.05791v1)

Published 9 May 2024 in cs.CV

Abstract: To fully understand the 3D context of a single image, a visual system must be able to segment both the visible and occluded regions of objects, while discerning their occlusion order. Ideally, the system should be able to handle any object and not be restricted to segmenting a limited set of object classes, especially in robotic applications. Addressing this need, we introduce a diffusion model with cumulative occlusion learning designed for sequential amodal segmentation of objects with uncertain categories. This model iteratively refines the prediction using the cumulative mask strategy during diffusion, effectively capturing the uncertainty of invisible regions and adeptly reproducing the complex distribution of shapes and occlusion orders of occluded objects. It is akin to the human capability for amodal perception, i.e., to decipher the spatial ordering among objects and accurately predict complete contours for occluded objects in densely layered visual scenes. Experimental results across three amodal datasets show that our method outperforms established baselines.
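The cumulative mask idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes, as an illustration only, that the cumulative mask for a given occlusion layer is the pixel-wise union of the amodal masks of all layers in front of it; the diffusion model itself is not reproduced here. The function name `cumulative_masks` and the nested-list mask format are hypothetical and not taken from the paper.

```python
from typing import List

# Hypothetical representation: a binary mask as nested lists, 1 = object pixel.
Mask = List[List[int]]


def cumulative_masks(layer_masks: List[Mask]) -> List[Mask]:
    """For each layer k (front to back), return the union of masks 0..k-1.

    Assumption: layers are ordered from the least occluded object to the
    most occluded one, and all masks share the same height and width.
    """
    if not layer_masks:
        return []
    height, width = len(layer_masks[0]), len(layer_masks[0][0])
    running = [[0] * width for _ in range(height)]
    result = []
    for mask in layer_masks:
        # Snapshot the union of everything seen so far, before adding this layer.
        result.append([row[:] for row in running])
        running = [[a | b for a, b in zip(r, m)] for r, m in zip(running, mask)]
    return result


if __name__ == "__main__":
    layers = [[[1, 0, 0]], [[1, 1, 0]], [[0, 1, 1]]]
    for k, cum in enumerate(cumulative_masks(layers)):
        print(k, cum)
```

In this sketch the first layer receives an empty cumulative mask, and each later layer sees everything already assigned to the objects in front of it, which is the kind of conditioning a sequential predictor could use when estimating the next occluded object.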

