
Adapting Self-Supervised Representations to Multi-Domain Setups (2309.03999v2)

Published 7 Sep 2023 in cs.CV and cs.LG

Abstract: Current state-of-the-art self-supervised approaches are effective when trained on individual domains but show limited generalization on unseen domains. We observe that these models generalize poorly even when trained on a mixture of domains, making them unsuitable for deployment in diverse real-world setups. We therefore propose a general-purpose, lightweight Domain Disentanglement Module (DDM) that can be plugged into any self-supervised encoder to effectively perform representation learning on multiple, diverse domains with or without shared classes. During pre-training with a self-supervised loss, DDM enforces disentanglement in the representation space by splitting it into a domain-variant and a domain-invariant portion. When domain labels are not available, DDM uses a robust clustering approach to discover pseudo-domains. We show that pre-training with DDM yields up to a 3.5% improvement in linear probing accuracy for state-of-the-art self-supervised models, including SimCLR, MoCo, BYOL, DINO, SimSiam, and Barlow Twins, on multi-domain benchmarks including PACS, DomainNet, and WILDS. Models trained with DDM also show significantly improved generalization (7.4%) to unseen domains compared to baselines. DDM can therefore efficiently adapt self-supervised encoders to provide high-quality, generalizable representations for diverse multi-domain data.
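
The abstract describes DDM at a high level: a small module attached to a self-supervised encoder that splits the representation into a domain-invariant part (trained with the usual self-supervised objective) and a domain-variant part (trained against domain labels, or pseudo-domain labels discovered by clustering when none are available). The following is a minimal PyTorch sketch of that idea; the module name, the fixed dimension split, the k-means pseudo-domain step, and the loss combination are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a domain-disentanglement module, assuming a backbone that
# produces flat feature vectors and a known (or guessed) number of domains.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans


class DomainDisentanglementModule(nn.Module):
    """Projects encoder features and splits them into two sub-spaces."""

    def __init__(self, feat_dim: int = 2048, inv_dim: int = 1024):
        super().__init__()
        self.proj = nn.Linear(feat_dim, feat_dim)
        self.inv_dim = inv_dim  # first inv_dim dims treated as domain-invariant

    def forward(self, features: torch.Tensor):
        z = self.proj(features)
        z_inv = z[:, : self.inv_dim]   # fed to the self-supervised loss
        z_var = z[:, self.inv_dim :]   # trained to predict (pseudo-)domains
        return z_inv, z_var


def pseudo_domain_labels(features: torch.Tensor, n_domains: int) -> torch.Tensor:
    """Cluster features to obtain pseudo-domain labels when none are given."""
    km = KMeans(n_clusters=n_domains, n_init=10)
    labels = km.fit_predict(features.detach().cpu().numpy())
    return torch.as_tensor(labels, dtype=torch.long)


# Usage sketch: one pre-training step combining a self-supervised loss on the
# invariant part with a domain-classification loss on the variant part.
encoder_out = torch.randn(32, 2048)            # stand-in for backbone features
ddm = DomainDisentanglementModule()
z_inv, z_var = ddm(encoder_out)
domains = pseudo_domain_labels(encoder_out, n_domains=4)
domain_head = nn.Linear(z_var.shape[1], 4)
domain_loss = nn.functional.cross_entropy(domain_head(z_var), domains)
# total_loss = ssl_loss(z_inv, ...) + lambda_d * domain_loss
```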

Authors (5)
  1. Neha Kalibhat
  2. Sam Sharpe
  3. Jeremy Goodsitt
  4. Bayan Bruss
  5. Soheil Feizi