Self-training via Metric Learning for Source-Free Domain Adaptation of Semantic Segmentation (2212.04227v2)
Abstract: Unsupervised source-free domain adaptation methods aim to train a model for the target domain using a pretrained source-domain model and unlabeled target-domain data, particularly when access to the source data is restricted by intellectual property or privacy concerns. Traditional methods usually rely on self-training with pseudo-labels, which are often thresholded by prediction confidence. However, such thresholding discards low-confidence predictions and therefore limits the effectiveness of self-training through insufficient supervision. This issue becomes more severe in the source-free setting, where supervision comes solely from the predictions of the pretrained source model. In this study, we propose a novel approach that incorporates a mean-teacher model, in which the student network is trained using all predictions from the teacher network. Instead of thresholding the predictions, we weight the gradients calculated from pseudo-labels according to the reliability of the teacher's predictions. To assess reliability, we introduce a novel approach based on proxy-based metric learning. Our method is evaluated in synthetic-to-real and cross-city scenarios, where it outperforms existing state-of-the-art methods.
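The training scheme summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the `alpha` and `temperature` values, and in particular the reliability score (a softmax over cosine similarities to one proxy per class) are assumptions made for illustration, since the abstract does not specify how reliability is computed from the proxies. Weighting each pixel's loss by a constant is used here as a stand-in for weighting its gradient.

```python
import numpy as np

def ema_update(teacher, student, alpha=0.99):
    """Mean-teacher step: move each teacher parameter toward the student."""
    return {k: alpha * teacher[k] + (1.0 - alpha) * student[k] for k in teacher}

def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def proxy_reliability(features, proxies, pseudo_labels, temperature=0.1):
    """Hypothetical reliability in [0, 1]: softmax over cosine similarity
    to class proxies, read off at each pixel's pseudo-label (an assumption)."""
    f = features / np.linalg.norm(features, axis=-1, keepdims=True)
    p = proxies / np.linalg.norm(proxies, axis=-1, keepdims=True)
    probs = np.exp(log_softmax(f @ p.T / temperature))
    return probs[np.arange(len(pseudo_labels)), pseudo_labels]

def weighted_pseudo_label_loss(student_logits, pseudo_labels, weights):
    """Cross-entropy on all pseudo-labels, scaled per pixel by its weight
    instead of dropping low-confidence pixels with a threshold."""
    logp = log_softmax(student_logits)
    nll = -logp[np.arange(len(pseudo_labels)), pseudo_labels]
    return float((weights * nll).mean())

# Toy usage: 4 "pixels", 3 classes, 2-D embeddings (all values made up).
rng = np.random.default_rng(0)
teacher_probs = rng.dirichlet(np.ones(3), size=4)   # teacher predictions
pseudo_labels = teacher_probs.argmax(axis=1)        # no thresholding
features = rng.normal(size=(4, 2))                  # teacher embeddings
proxies = rng.normal(size=(3, 2))                   # one proxy per class
weights = proxy_reliability(features, proxies, pseudo_labels)
loss = weighted_pseudo_label_loss(rng.normal(size=(4, 3)), pseudo_labels, weights)
```

Every pixel contributes to the loss, but pixels whose embedding lies far from the proxy of their pseudo-label contribute little, which is the intended contrast with hard confidence thresholding.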
- Ibrahim Batuhan Akkaya
- Ugur Halici