Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
119 tokens/sec
GPT-4o
56 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Weakly-supervised Semantic Segmentation via Dual-stream Contrastive Learning of Cross-image Contextual Information (2405.04913v1)

Published 8 May 2024 in cs.CV

Abstract: Weakly supervised semantic segmentation (WSSS) aims at learning a semantic segmentation model with only image-level tags. Despite intensive research on deep learning approaches over a decade, there is still a significant performance gap between WSSS and full semantic segmentation. Most current WSSS methods always focus on a limited single image (pixel-wise) information while ignoring the valuable inter-image (semantic-wise) information. From this perspective, a novel end-to-end WSSS framework called DSCNet is developed along with two innovations: i) pixel-wise group contrast and semantic-wise graph contrast are proposed and introduced into the WSSS framework; ii) a novel dual-stream contrastive learning (DSCL) mechanism is designed to jointly handle pixel-wise and semantic-wise context information for better WSSS performance. Specifically, the pixel-wise group contrast learning (PGCL) and semantic-wise graph contrast learning (SGCL) tasks form a more comprehensive solution. Extensive experiments on PASCAL VOC and MS COCO benchmarks verify the superiority of DSCNet over SOTA approaches and baseline models.

Definition Search Book Streamline Icon: https://streamlinehq.com
References (56)
  1. Learning pixel-level semantic affinity with image-level supervision for weakly supervised semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 4981–4990, 2018.
  2. Semi-supervised semantic segmentation with pixel-level contrastive learning from a class-wise memory bank. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 8219–8228, 2021.
  3. A semantic similarity-based perspective of affect lexicons for sentiment analysis. Knowledge-Based Systems, 165:346–359, 2019.
  4. Single-stage semantic segmentation from image labels. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4253–4262, 2020.
  5. Weakly supervised deep detection networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2846–2854, 2016.
  6. Contrastive learning of global and local features for medical image segmentation with limited annotations. Advances in Neural Information Processing Systems, 33:12546–12558, 2020.
  7. Weakly-supervised semantic segmentation via sub-category exploration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8991–9000, 2020.
  8. Weakly supervised semantic segmentation with boundary exploration. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXVI, pages 347–362, 2020.
  9. Self-supervised image-specific prototype exploration for weakly supervised semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4288–4298, 2022.
  10. A simple framework for contrastive learning of visual representations. In International conference on machine learning, pages 1597–1607. PMLR, 2020.
  11. Attention-based dropout layer for weakly supervised single object localization and semantic segmentation. IEEE transactions on pattern analysis and machine intelligence, 43(12):4256–4271, 2020.
  12. Pearson correlation coefficient. Noise reduction in speech processing, pages 1–4, 2009.
  13. Deeply unsupervised patch re-identification for pre-training object detectors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022.
  14. Weakly supervised semantic segmentation by pixel-to-prototype contrast. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4320–4329, 2022.
  15. The pascal visual object classes (voc) challenge. International journal of computer vision, 88:303–308, 2009.
  16. Learning integral objects with intra-class discriminator for weakly-supervised semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4283–4292, 2020.
  17. Cian: Cross-image affinity net for weakly supervised semantic segmentation. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 10762–10769, 2020.
  18. Reliable mutual distillation for medical image segmentation under imperfect annotations. IEEE Transactions on Medical Imaging, 2023.
  19. Bootstrap your own latent-a new approach to self-supervised learning. Advances in neural information processing systems, 33:21271–21284, 2020.
  20. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
  21. Learning to segment every thing. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 4233–4241, 2018.
  22. Integral object mining via online attention accumulation. In Proceedings of the IEEE/CVF international conference on computer vision, pages 2070–2079, 2019.
  23. L2g: A simple local-to-global knowledge transfer framework for weakly supervised semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16886–16896, 2022.
  24. Towards better explanations of class activation mapping. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 1336–1344, 2021.
  25. Contrastive learning for sports video: Unsupervised player classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4528–4536, 2021.
  26. Bbam: Bounding box attribution map for weakly supervised semantic and instance segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2643–2652, 2021.
  27. Railroad is not a train: Saliency as pseudo-pixel supervision for weakly supervised semantic segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5495–5505, 2021.
  28. Pseudo-mask matters in weakly-supervised semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 6964–6973, 2021.
  29. Scribblesup: Scribble-supervised convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3159–3167, 2016.
  30. Microsoft coco: Common objects in context. In Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V 13, pages 740–755. Springer, 2014.
  31. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3431–3440, 2015.
  32. Learning saliency-free model with generic features for weakly-supervised semantic segmentation. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 11717–11724, 2020.
  33. U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III 18, pages 234–241. Springer, 2015.
  34. Learning affinity from attention: end-to-end weakly-supervised semantic segmentation with transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16846–16855, 2022.
  35. The graph neural network model. IEEE transactions on neural networks, 20(1):61–80, 2008.
  36. Mining cross-image semantics for weakly supervised semantic segmentation. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part II 16, pages 347–365. Springer, 2020.
  37. Contrastive learning based hybrid networks for long-tailed image classification. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 943–952, 2021.
  38. Exploring cross-image pixel contrast for semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 7303–7313, 2021.
  39. Weakly-supervised semantic segmentation by iterative affinity learning. International Journal of Computer Vision, 128:1736–1749, 2020.
  40. Dense contrastive learning for self-supervised visual pre-training. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3024–3033, 2021.
  41. Self-supervised equivariant attention mechanism for weakly supervised semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12275–12284, 2020.
  42. Efficient outdoor video semantic segmentation using feedback-based fully convolution neural network. IEEE Transactions on Industrial Informatics, 16(8):5128–5136, 2019.
  43. Embedded discriminative attention mechanism for weakly supervised semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16765–16774, 2021.
  44. Detco: Unsupervised contrastive learning for object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 8392–8401, 2021.
  45. Leveraging auxiliary tasks with affinity learning for weakly supervised semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 6984–6993, 2021.
  46. Multi-class token transformer for weakly supervised semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4310–4319, 2022.
  47. Research on k-value selection method of k-means clustering algorithm. J, 2(2):226–235, 2019.
  48. Wsod2: Learning bottom-up and top-down objectness distillation for weakly-supervised object detection. In Proceedings of the IEEE/CVF international conference on computer vision, pages 8292–8300, 2019.
  49. Generalized weakly supervised object localization. IEEE Transactions on Neural Networks and Learning Systems, 2022.
  50. Weakly supervised semantic segmentation via alternate self-dual teaching. IEEE Transactions on Image Processing, 2023.
  51. Causal intervention for weakly-supervised semantic segmentation. Advances in Neural Information Processing Systems, 33:655–666, 2020.
  52. Complementary patch for weakly supervised semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 7242–7251, 2021.
  53. Pixel contrastive-consistent semi-supervised semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 7273–7282, 2021.
  54. Learning deep features for discriminative localization. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2921–2929, 2016.
  55. Group-wise learning for weakly supervised semantic segmentation. IEEE Transactions on Image Processing, 31:799–811, 2021.
  56. Regional semantic contrast and aggregation for weakly supervised semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4299–4309, 2022.
User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (2)
  1. Qi Lai (4 papers)
  2. Chi-Man Vong (14 papers)

Summary

We haven't generated a summary for this paper yet.

X Twitter Logo Streamline Icon: https://streamlinehq.com