- The paper introduces a proxy variable framework using proximal causal learning to achieve domain adaptation without directly modeling latent confounders.
- It establishes new identifiability results and develops two-stage kernel estimators applicable to both Concept Bottleneck and Multi-Domain settings.
- Empirical results on synthetic and real datasets demonstrate the method’s superiority over traditional approaches under varying latent shifts.
Proxy Methods for Domain Adaptation Under Latent Shift
Introduction
Domain adaptation addresses the problem of applying a model trained in one domain to a distinct but related target domain, especially when the target domain lacks labels. The challenge is accentuated when shifts between domains are driven by changes in the distribution of unobserved latent variables that confound both the input features and the labels. Traditional approaches either make strong assumptions about the type of shift or rely on explicitly modeling the latent variables, which may be infeasible or inaccurate in complex real-world settings.
Proxy Methods for Causal Learning
This paper proposes a framework that leverages proximal causal learning to perform domain adaptation under latent shift without directly identifying or modeling the latent confounder. The approach uses proxy variables, which indirectly carry information about the latent confounder, to enable adaptation to the distribution shift. Two scenarios are considered: the Concept Bottleneck setting, in which an additional "concept" variable mediates the relationship between features and labels, and the Multi-Domain setting, with multiple source domains but no explicit concept mediator.
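As a concrete, hypothetical illustration of a latent shift with a proxy variable, consider a toy generative model in which only the distribution of a latent confounder changes across domains. The variable names, coefficients, and noise scales below are invented for illustration and are not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_domain(p_u, n=5000):
    """Simulate one domain of a toy latent-shift model:
    a binary latent confounder U drives features X, proxy W, and label Y.
    Only P(U) differs across domains (the latent shift)."""
    U = rng.binomial(1, p_u, size=n)        # latent confounder (unobserved in practice)
    X = U + rng.normal(0.0, 1.0, size=n)    # observed features depend on U
    W = U + rng.normal(0.0, 0.5, size=n)    # proxy: a noisy, indirect view of U
    Y = 2.0 * U + 0.5 * X + rng.normal(0.0, 0.1, size=n)
    return X, W, Y

# Source and target differ only in the latent distribution P(U)
Xs, Ws, Ys = sample_domain(p_u=0.2)   # source: U = 1 is rare
Xt, Wt, Yt = sample_domain(p_u=0.8)   # target: U = 1 is common

# A predictor fit on the source alone is biased on the target,
# because E[Y | X] changes when P(U) shifts.
```

The proxy W is not an extra input feature: its role is to reveal how the latent distribution has moved between domains without ever reconstructing U itself.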
Identification and Estimation
Under latent shift, the guarantees of traditional domain adaptation methods built on covariate shift or label shift assumptions no longer hold. The proposed method establishes new identifiability results, showing that under certain conditions the optimal target prediction function can be identified from proxy variables without directly estimating the latent variable's distribution. This is achieved by constructing bridge functions, which translate the adaptation tasks in both the Concept Bottleneck and Multi-Domain settings into solvable estimation problems.
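The bridge-function idea can be sketched schematically. The notation below (proxies $W$ and $Z$ that are conditionally independent given the latent confounder $U$) is illustrative of proximal causal inference in general, not the paper's exact statement:

```latex
% An outcome bridge function h is defined entirely on observables:
\mathbb{E}[Y \mid Z, X] \;=\; \mathbb{E}\!\left[h(W, X) \mid Z, X\right]
% Under completeness conditions, the target-domain predictor is then
% recovered by averaging h over the target distribution of the proxy W,
% with no need to estimate the distribution of U itself:
\mathbb{E}_{\mathrm{target}}[Y \mid X] \;=\; \mathbb{E}_{\mathrm{target}}\!\left[h(W, X) \mid X\right]
```

The key point is that the latent shift is absorbed through the observable proxy $W$: only how $W$ shifts in the target domain matters for prediction.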
Theoretical Contributions
- Demonstrated that the optimal predictor in the target domain can be identified using proxy methods without making restrictive assumptions on the nature of the latent confounder or its distribution.
- Developed practical two-stage kernel estimators for adaptation, facilitating application to complex distribution shifts.
- Presented new identification theorems for domain adaptation under latent shifts, extending the applicability of proximal causal inference methods to a broader class of adaptation scenarios.
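The two-stage structure of such estimators can be sketched in heavily simplified form. The code below is a stand-in for the general idea (stage 1 learns a conditional-mean representation of one proxy given the other proxy and features; stage 2 fits a bridge-like function on that representation), not the paper's actual kernel estimator; all variable names and hyperparameters are invented for illustration:

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(1)

# Toy data: latent U confounds X and Y; W and Z are two proxies of U.
n = 2000
U = rng.normal(size=n)
X = U + rng.normal(scale=1.0, size=n)
W = U + rng.normal(scale=0.3, size=n)   # outcome-side proxy
Z = U + rng.normal(scale=0.3, size=n)   # action-side proxy
Y = np.sin(U) + 0.5 * X + rng.normal(scale=0.1, size=n)

# Stage 1: kernel ridge regression of W on (Z, X),
# approximating the conditional mean E[W | Z, X].
stage1 = KernelRidge(kernel="rbf", alpha=1e-2, gamma=0.5)
stage1.fit(np.column_stack([Z, X]), W)
W_hat = stage1.predict(np.column_stack([Z, X]))

# Stage 2: regress Y on (W_hat, X) to fit a bridge-like function h(W, X).
stage2 = KernelRidge(kernel="rbf", alpha=1e-2, gamma=0.5)
stage2.fit(np.column_stack([W_hat, X]), Y)

# In a shifted target domain, h is evaluated at the target's observed
# proxies and features; U is never reconstructed.
U_t = rng.normal(loc=1.0, size=500)           # latent shift: P(U) has moved
X_t = U_t + rng.normal(scale=1.0, size=500)
W_t = U_t + rng.normal(scale=0.3, size=500)
Y_pred = stage2.predict(np.column_stack([W_t, X_t]))
```

The design choice to split estimation into two regressions mirrors the structure of the identification result: stage 1 handles the proxy relationship, stage 2 handles prediction, and each stage is an ordinary regression problem.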
Empirical Validation
Experimental results on synthetic data and the dSprites dataset illustrate the proposed method's efficacy compared to existing adaptation approaches, including empirical risk minimization and methods that explicitly model the latent variables. In both the Concept Bottleneck and Multi-Domain settings, the proposed method consistently outperformed the baselines, demonstrating robustness across varying degrees of latent shift.
Implications and Future Directions
Adaptation techniques that handle latent shift offer a promising direction for domain adaptation research, especially in situations where the latent structure underlying a distribution shift cannot be easily characterized or observed. Future work could extend these methods to continuous latent variables and investigate their effectiveness on more complex, real-world datasets beyond controlled experimental settings.
Conclusion
This work takes a significant step towards addressing domain adaptation under the challenging condition of latent shifts. By harnessing proximal causal learning and proxy variables, it offers a robust framework for domain adaptation without the stringent requirements of modeling latent confounders directly. This approach has profound implications for various applications, including healthcare and image processing, where domain shifts are prevalent, but latent variables driving these shifts are not directly observable.