
Self-training via Metric Learning for Source-Free Domain Adaptation of Semantic Segmentation (2212.04227v2)

Published 8 Dec 2022 in cs.CV and cs.LG

Abstract: Unsupervised source-free domain adaptation methods aim to train a model for the target domain utilizing a pretrained source-domain model and unlabeled target-domain data, particularly when accessibility to source data is restricted due to intellectual property or privacy concerns. Traditional methods usually use self-training with pseudo-labeling, which is often subjected to thresholding based on prediction confidence. However, such thresholding limits the effectiveness of self-training due to insufficient supervision. This issue becomes more severe in a source-free setting, where supervision comes solely from the predictions of the pre-trained source model. In this study, we propose a novel approach by incorporating a mean-teacher model, wherein the student network is trained using all predictions from the teacher network. Instead of employing thresholding on predictions, we introduce a method to weight the gradients calculated from pseudo-labels based on the reliability of the teacher's predictions. To assess reliability, we introduce a novel approach using proxy-based metric learning. Our method is evaluated in synthetic-to-real and cross-city scenarios, demonstrating superior performance compared to existing state-of-the-art methods.
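The abstract names three ingredients: a mean-teacher pair, pseudo-labels taken from every teacher prediction (no confidence threshold), and per-pixel weights derived from a proxy-based reliability score. The following NumPy sketch illustrates how these pieces could fit together. It is not the authors' implementation: the abstract does not give the reliability formula, so `proxy_reliability` (cosine similarity to the pseudo-label class proxy, rescaled to [0, 1]), the momentum value, and all function names are assumptions made for illustration only.

```python
import numpy as np

def ema_update(teacher, student, momentum=0.99):
    """Mean-teacher step: teacher weights follow an exponential moving
    average of the student weights. The momentum value is an assumption."""
    return {k: momentum * teacher[k] + (1.0 - momentum) * student[k]
            for k in teacher}

def softmax(logits, axis=-1):
    """Numerically stable softmax."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def proxy_reliability(embeddings, proxies, pseudo_labels):
    """HYPOTHETICAL reliability score in [0, 1]: cosine similarity between
    each pixel embedding (N, D) and the proxy (C, D) of its pseudo-label
    class, rescaled from [-1, 1]. The paper's actual score may differ."""
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    p = proxies / np.linalg.norm(proxies, axis=1, keepdims=True)
    cos = np.sum(e * p[pseudo_labels], axis=1)
    return 0.5 * (cos + 1.0)

def weighted_pseudo_label_loss(student_logits, teacher_logits, weights):
    """Cross-entropy against the teacher's argmax labels for ALL pixels.
    Each pixel's loss (and therefore its gradient) is scaled by its
    reliability weight instead of being dropped by a threshold."""
    pseudo = teacher_logits.argmax(axis=1)
    probs = softmax(student_logits, axis=1)
    ce = -np.log(probs[np.arange(len(pseudo)), pseudo] + 1e-12)
    return float(np.mean(weights * ce)), pseudo

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_pixels, n_classes, dim = 6, 3, 4
    teacher_logits = rng.normal(size=(n_pixels, n_classes))
    student_logits = rng.normal(size=(n_pixels, n_classes))
    embeddings = rng.normal(size=(n_pixels, dim))
    proxies = rng.normal(size=(n_classes, dim))

    pseudo = teacher_logits.argmax(axis=1)
    w = proxy_reliability(embeddings, proxies, pseudo)
    loss, _ = weighted_pseudo_label_loss(student_logits, teacher_logits, w)
    print("reliability weights:", np.round(w, 3))
    print("weighted loss:", round(loss, 4))
```

Because the weights are treated as constants, scaling each pixel's loss by its weight is equivalent to scaling that pixel's gradient contribution, which is the behavior the abstract describes as an alternative to hard thresholding. A pixel with weight 0 contributes nothing, while every other pixel still provides some supervision.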

Authors (2)
  1. Ibrahim Batuhan Akkaya (5 papers)
  2. Ugur Halici (7 papers)
