
Video Instance Shadow Detection Under the Sun and Sky (2211.12827v3)

Published 23 Nov 2022 in cs.CV

Abstract: Instance shadow detection, crucial for applications such as photo editing and light direction estimation, has undergone significant advancements in predicting shadow instances, object instances, and their associations. The extension of this task to videos presents challenges in annotating diverse video data and addressing complexities arising from occlusion and temporary disappearances within associations. In response to these challenges, we introduce ViShadow, a semi-supervised video instance shadow detection framework that leverages both labeled image data and unlabeled video data for training. ViShadow features a two-stage training pipeline: the first stage, utilizing labeled image data, identifies shadow and object instances through contrastive learning for cross-frame pairing. The second stage employs unlabeled videos, incorporating an associated cycle consistency loss to enhance tracking ability. A retrieval mechanism is introduced to manage temporary disappearances, ensuring tracking continuity. The SOBA-VID dataset, comprising unlabeled training videos and labeled testing videos, along with the SOAP-VID metric, is introduced for the quantitative evaluation of VISD solutions. The effectiveness of ViShadow is further demonstrated through various video-level applications such as video inpainting, instance cloning, shadow editing, and text-instructed shadow-object manipulation.
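The second training stage described in the abstract relies on a cycle consistency loss computed on unlabeled video. As a rough illustration of the general idea only, the sketch below implements a generic forward-backward cycle-consistency objective: instance embeddings in one frame are softly matched to those in another frame and back again, and the loss penalizes instances that fail to return to themselves. The function names, the temperature value, and the toy embeddings are all assumptions made for illustration. This is not the associated cycle consistency loss defined in the paper, whose exact formulation is not given in the abstract.

```python
import math


def softmax(row, temperature=0.1):
    """Numerically stable softmax over a list of similarity scores."""
    scaled = [v / temperature for v in row]
    peak = max(scaled)
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def transition(src, dst, temperature=0.1):
    """Soft assignment matrix: row i gives match probabilities of src[i] over dst."""
    return [softmax([cosine(s, d) for d in dst], temperature) for s in src]


def cycle_consistency_loss(frame_a, frame_b, temperature=0.1):
    """Mean negative log-probability that each instance in frame_a
    returns to itself after a soft round trip a -> b -> a."""
    a_to_b = transition(frame_a, frame_b, temperature)
    b_to_a = transition(frame_b, frame_a, temperature)
    loss = 0.0
    for i in range(len(frame_a)):
        # Probability of leaving instance i and landing back on instance i.
        p_return = sum(a_to_b[i][k] * b_to_a[k][i] for k in range(len(frame_b)))
        loss += -math.log(p_return)
    return loss / len(frame_a)


# Toy (hypothetical) 2-D embeddings for two instances in consecutive frames.
frame_a = [[1.0, 0.0], [0.0, 1.0]]
frame_b_distinct = [[0.9, 0.1], [0.1, 0.9]]    # instances stay separable
frame_b_ambiguous = [[0.7, 0.7], [0.7, 0.7]]   # instances collapse together

loss_distinct = cycle_consistency_loss(frame_a, frame_b_distinct)
loss_ambiguous = cycle_consistency_loss(frame_a, frame_b_ambiguous)
```

In this toy setup the loss is lower when embeddings stay distinguishable across frames, which is the property such an objective rewards. No labels are needed, since the supervision signal comes from each instance returning to its own starting point.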
