
ShapeFormer: Shape Prior Visible-to-Amodal Transformer-based Amodal Instance Segmentation (2403.11376v4)

Published 18 Mar 2024 in cs.CV

Abstract: Amodal Instance Segmentation (AIS) is a challenging task that involves predicting both the visible and occluded parts of objects within images. Existing AIS methods rely on a bidirectional approach, encompassing both the transition from amodal features to visible features (amodal-to-visible) and from visible features to amodal features (visible-to-amodal). We observe that amodal features propagated through the amodal-to-visible transition can corrupt the visible features, because they carry extra information about occluded/hidden segments that is not present in the visible region. This in turn compromises the quality of the visible features used in the subsequent visible-to-amodal transition. To tackle this issue, we introduce ShapeFormer, a decoupled Transformer-based model that uses only the visible-to-amodal transition. It models the relationship between the output segmentations explicitly and avoids the need for an amodal-to-visible transition. ShapeFormer comprises three key modules: (i) a Visible-Occluding Mask Head that predicts visible segmentation with occlusion awareness, (ii) a Shape-Prior Amodal Mask Head that predicts amodal and occluded masks, and (iii) a Category-Specific Shape Prior Retriever that provides shape prior knowledge. Comprehensive experiments and extensive ablation studies across various AIS benchmarks demonstrate the effectiveness of ShapeFormer. The code is available at: \url{https://github.com/UARK-AICV/ShapeFormer}
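The decoupled visible-to-amodal flow described in the abstract can be sketched as follows. This is a minimal illustrative sketch based only on the abstract: the function names, array shapes, thresholding, and the union-based fusion rule are assumptions for clarity, not the paper's actual transformer-based implementation.

```python
import numpy as np

def visible_occluding_head(roi_feat):
    # (i) Predict a visible mask with occlusion awareness; a fixed threshold
    # stands in for a learned mask head, and the occlusion score is a toy proxy.
    visible = roi_feat > 0.5
    occlusion_score = float(1.0 - visible.mean())
    return visible, occlusion_score

def shape_prior_retriever(category, priors):
    # (iii) Retrieve a category-specific shape prior (e.g. from a codebook
    # keyed by object category; a plain dict lookup here).
    return priors[category]

def shape_prior_amodal_head(visible, prior):
    # (ii) Predict amodal and occluded masks from the visible mask plus the
    # shape prior. A boolean union is a toy stand-in for the learned fusion.
    amodal = visible | prior
    occluded = amodal & ~visible  # occluded region = amodal minus visible
    return amodal, occluded
```

Note the one-way dependency: the amodal head consumes the visible mask, but nothing flows back from amodal features into visible prediction, which is the decoupling the abstract argues for.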

Authors (6)
  1. Minh Tran (43 papers)
  2. Winston Bounsavy (2 papers)
  3. Khoa Vo (16 papers)
  4. Anh Nguyen (157 papers)
  5. Tri Nguyen (47 papers)
  6. Ngan Le (84 papers)
Citations (1)
