
Amodal Completion via Progressive Mixed Context Diffusion (2312.15540v1)

Published 24 Dec 2023 in cs.CV

Abstract: Our brain can effortlessly recognize objects even when partially hidden from view. Seeing the visible of the hidden is called amodal completion; however, this task remains a challenge for generative AI despite rapid progress. We propose to sidestep many of the difficulties of existing approaches, which typically involve a two-step process of predicting amodal masks and then generating pixels. Our method involves thinking outside the box, literally! We go outside the object bounding box to use its context to guide a pre-trained diffusion inpainting model, and then progressively grow the occluded object and trim the extra background. We overcome two technical challenges: 1) how to be free of unwanted co-occurrence bias, which tends to regenerate similar occluders, and 2) how to judge if an amodal completion has succeeded. Our amodal completion method exhibits improved photorealistic completion results compared to existing approaches in numerous successful completion cases. And the best part? It doesn't require any special training or fine-tuning of models.
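The "go outside the bounding box, then progressively grow" idea from the abstract can be illustrated with a small control-flow sketch. This is not the authors' implementation: the helper names (`expand_box`, `progressive_completion`), the growth factor, the step limit, and the `inpaint` and `is_complete` callables are all hypothetical placeholders. In particular, `inpaint` stands in for a pre-trained diffusion inpainting model and `is_complete` for whatever success check is used. The trimming of extra background and the handling of co-occurrence bias are omitted.

```python
import math
from typing import Callable, Tuple

import numpy as np

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), upper bounds exclusive


def expand_box(box: Box, factor: float, height: int, width: int) -> Box:
    """Grow a box about its center by `factor`, clipped to the image bounds."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half_w = (x1 - x0) * factor / 2
    half_h = (y1 - y0) * factor / 2
    return (
        max(0, math.floor(cx - half_w)),
        max(0, math.floor(cy - half_h)),
        min(width, math.ceil(cx + half_w)),
        min(height, math.ceil(cy + half_h)),
    )


def progressive_completion(
    image: np.ndarray,
    visible_mask: np.ndarray,
    box: Box,
    inpaint: Callable[[np.ndarray, np.ndarray], np.ndarray],
    is_complete: Callable[[np.ndarray, Box], bool],
    factor: float = 1.5,  # assumed growth factor, not from the paper
    max_steps: int = 4,   # assumed step limit, not from the paper
) -> Tuple[np.ndarray, Box]:
    """Repeatedly widen the context box and inpaint everything in it
    except the visible object pixels, until `is_complete` says stop."""
    height, width = visible_mask.shape
    current, result = box, image
    for _ in range(max_steps):
        current = expand_box(current, factor, height, width)
        x0, y0, x1, y1 = current
        region = np.zeros((height, width), dtype=bool)
        region[y0:y1, x0:x1] = True
        # Keep visible object pixels fixed; only the rest of the box is filled.
        fill_mask = region & ~visible_mask.astype(bool)
        result = inpaint(result, fill_mask)
        if is_complete(result, current):
            break
    return result, current
```

With a dummy `inpaint` such as `lambda img, m: np.where(m, 1.0, img)`, the loop can be exercised without any model; swapping in a real inpainting pipeline would only change that callable.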
