Bridging Domains with Approximately Shared Features
Abstract: Multi-source domain adaptation aims to reduce performance degradation when machine learning models are applied to unseen domains. A fundamental challenge is devising the optimal strategy for feature selection. The existing literature is somewhat paradoxical: some works advocate learning invariant features from the source domains, while others favor more diverse features. To address this challenge, we propose a statistical framework that distinguishes the utility of features based on the variance of their correlation with the label $y$ across domains. Under our framework, we design and analyze a learning procedure that first learns an approximately shared feature representation from the source tasks and then fine-tunes it on the target task. Our theoretical analysis demonstrates the importance of learning approximately shared features rather than only the strictly invariant ones, and yields an improved population risk over previous results on both the source and target tasks, thus partly resolving the paradox above. Inspired by our theory, we propose a more practical way to isolate the content features (invariant plus approximately shared) from the environmental features, which further consolidates our theoretical findings.
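To make the abstract's two ideas concrete, below is a minimal NumPy sketch under strong simplifying assumptions (linear features, scalar labels, a hand-picked variance threshold). The names `correlation_variance`, `two_stage_procedure`, and `var_thresh` are hypothetical illustrations, not the paper's actual algorithm.

```python
# A minimal sketch, assuming linear features: features whose correlation
# with the label y is stable across domains are treated as "content"
# (invariant + approximately shared); the rest as environmental.
# All function names and the threshold are hypothetical, not from the paper.
import numpy as np

def correlation_variance(domains):
    """For each feature, compute the variance across domains of its
    Pearson correlation with y. domains: list of (X, y) pairs with
    X of shape (n_i, d) and y of shape (n_i,). Returns shape (d,)."""
    corrs = []
    for X, y in domains:
        Xc = X - X.mean(axis=0)
        yc = y - y.mean()
        corr = (Xc * yc[:, None]).mean(axis=0) / (
            Xc.std(axis=0) * yc.std() + 1e-12)
        corrs.append(corr)
    return np.var(np.stack(corrs), axis=0)

def two_stage_procedure(source_domains, target_X, target_y, var_thresh=0.01):
    """Stage 1: keep features whose label-correlation is stable across the
    source domains (approximately shared, not only strictly invariant).
    Stage 2: fine-tune a simple ridge predictor on the target task."""
    var = correlation_variance(source_domains)
    content = var <= var_thresh  # invariant + approximately shared features
    Xs = target_X[:, content]
    w = np.linalg.solve(Xs.T @ Xs + 1e-3 * np.eye(Xs.shape[1]),
                        Xs.T @ target_y)
    return content, w

# Tiny synthetic demo: feature 0 carries a stable signal, feature 4 a
# domain-dependent (environmental) signal, features 1-3 are noise.
rng = np.random.default_rng(0)
doms = []
for _ in range(4):
    X = rng.normal(size=(200, 5))
    y = X[:, 0] + rng.normal(scale=0.5) * X[:, 4] + 0.1 * rng.normal(size=200)
    doms.append((X, y))
content, w = two_stage_procedure(doms[:-1], *doms[-1])
```

The point of the sketch is the selection criterion: rather than keeping only features whose correlation with $y$ is exactly constant across domains (strict invariance), it retains any feature whose correlation variance falls below a tolerance, which is one simple reading of "approximately shared".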