
Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised Semantic Segmentation (2403.01156v1)

Published 2 Mar 2024 in cs.CV

Abstract: Most existing weakly supervised semantic segmentation (WSSS) methods rely on Class Activation Mapping (CAM) to extract coarse class-specific localization maps using image-level labels. Prior works have commonly used an off-line heuristic thresholding process that combines the CAM maps with off-the-shelf saliency maps produced by a general pre-trained saliency model to produce more accurate pseudo-segmentation labels. We propose AuxSegNet+, a weakly supervised auxiliary learning framework to explore the rich information from these saliency maps and the significant inter-task correlation between saliency detection and semantic segmentation. In the proposed AuxSegNet+, saliency detection and multi-label image classification are used as auxiliary tasks to improve the primary task of semantic segmentation with only image-level ground-truth labels. We also propose a cross-task affinity learning mechanism to learn pixel-level affinities from the saliency and segmentation feature maps. In particular, we propose a cross-task dual-affinity learning module to learn both pairwise and unary affinities, which are used to enhance the task-specific features and predictions by aggregating both query-dependent and query-independent global context for both saliency detection and semantic segmentation. The learned cross-task pairwise affinity can also be used to refine and propagate CAM maps to provide better pseudo labels for both tasks. Iterative improvement of segmentation performance is enabled by cross-task affinity learning and pseudo-label updating. Extensive experiments demonstrate the effectiveness of the proposed approach with new state-of-the-art WSSS results on the challenging PASCAL VOC and MS COCO benchmarks.
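The abstract describes two kinds of learned affinity: a pairwise (query-dependent) affinity between pixels and a unary (query-independent) affinity over pixels, both shared across the saliency and segmentation tasks and also used to refine CAM maps. The following is a minimal NumPy sketch of that general idea, not the authors' implementation. The function names, the dot-product similarity, the simple averaging of the two tasks' affinities, and the projection vectors `w_seg` and `w_sal` are all illustrative assumptions that do not come from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def pairwise_affinity(feat):
    # feat: (N, C) flattened feature map with N pixels.
    # Returns an (N, N) row-stochastic matrix (query-dependent context).
    # Dot-product similarity is an assumption made for this sketch.
    return softmax(feat @ feat.T, axis=1)

def unary_affinity(feat, w):
    # feat: (N, C); w: (C,) hypothetical projection vector.
    # Returns (N,) weights summing to 1 (query-independent context).
    return softmax(feat @ w, axis=0)

def dual_affinity_refine(seg_feat, sal_feat, w_seg, w_sal, cam):
    # Cross-task affinities: here simply the mean of the two tasks'
    # affinities (an assumption; the paper learns its own fusion).
    pair = 0.5 * (pairwise_affinity(seg_feat) + pairwise_affinity(sal_feat))
    unary = 0.5 * (unary_affinity(seg_feat, w_seg) + unary_affinity(sal_feat, w_sal))
    # Enhance segmentation features with both kinds of global context.
    # (unary @ seg_feat) has shape (C,) and broadcasts over all N pixels.
    seg_ctx = pair @ seg_feat + unary @ seg_feat
    # Propagate CAM scores along the pairwise affinity.
    refined_cam = pair @ cam
    return pair, unary, seg_ctx, refined_cam

# Toy usage: 6 pixels, 4 feature channels, 3 classes.
rng = np.random.default_rng(0)
seg_feat = rng.normal(size=(6, 4))
sal_feat = rng.normal(size=(6, 4))
w_seg = rng.normal(size=4)
w_sal = rng.normal(size=4)
cam = rng.random(size=(6, 3))
pair, unary, seg_ctx, refined_cam = dual_affinity_refine(
    seg_feat, sal_feat, w_seg, w_sal, cam
)
```

Because each row of `pair` sums to one, every refined CAM value is a convex combination of the original CAM values for that class, so the propagation smooths scores across similar pixels without leaving the original value range.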

Authors (6)
  1. Lian Xu (6 papers)
  2. Mohammed Bennamoun (124 papers)
  3. Farid Boussaid (30 papers)
  4. Wanli Ouyang (358 papers)
  5. Ferdous Sohel (35 papers)
  6. Dan Xu (120 papers)
