
Image Augmentation with Controlled Diffusion for Weakly-Supervised Semantic Segmentation (2310.09760v3)

Published 15 Oct 2023 in cs.CV

Abstract: Weakly-supervised semantic segmentation (WSSS), which aims to train segmentation models using only image-level labels, has attracted significant attention. Existing methods primarily focus on generating high-quality pseudo labels from the available images and their image-level labels. However, the quality of pseudo labels degrades significantly when the available dataset is small. In this paper, we tackle this problem from a different angle by introducing a novel approach called Image Augmentation with Controlled Diffusion (IACD). This framework augments existing labeled datasets by generating diverse images through controlled diffusion, where the available images and image-level labels serve as the controlling information. We also propose a high-quality image selection strategy to mitigate the potential noise introduced by the randomness of diffusion models. In the experiments, our proposed IACD approach clearly surpasses existing state-of-the-art methods. The improvement is more pronounced when the amount of available data is small, demonstrating the effectiveness of our method.
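The augment-then-select loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not specify how images are generated or scored, so `generate`, `score`, `candidates_per_image`, and `threshold` are all hypothetical names, and thresholding on a score is only one plausible way to filter noisy generations.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical placeholder types: a real pipeline would use image tensors.
Image = List[float]
Sample = Tuple[Image, Sequence[str]]  # (image, image-level labels)


def augment_with_selection(
    dataset: Sequence[Sample],
    generate: Callable[[Image, Sequence[str]], Image],
    score: Callable[[Image, Sequence[str]], float],
    candidates_per_image: int = 4,
    threshold: float = 0.5,
) -> List[Sample]:
    """Return the original samples plus the generated samples that pass selection.

    `generate` stands in for a controlled diffusion model conditioned on the
    source image and its labels (assumption). `score` stands in for whatever
    quality measure the selection strategy uses (assumption); here a candidate
    is kept when its score is at least `threshold`.
    """
    augmented: List[Sample] = list(dataset)
    for image, labels in dataset:
        for _ in range(candidates_per_image):
            candidate = generate(image, labels)
            # Keep only candidates judged consistent with the image-level labels.
            if score(candidate, labels) >= threshold:
                augmented.append((candidate, labels))
    return augmented
```

Generated images inherit the image-level labels of their source image, so the augmented set can be fed to any existing WSSS pipeline without additional annotation.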
