
Source-Free Domain Adaptation

Updated 14 February 2026
  • Source-Free Domain Adaptation is a framework that adapts a source-trained model to a target domain without accessing the original source data, ensuring data privacy and addressing domain shifts.
  • Its methodologies include self-training, pseudo-labeling, clustering, and contrastive learning to enforce consistency and robust feature alignment in the target data.
  • Empirical results demonstrate that SFDA converges faster, outperforms traditional UDA on benchmarks, and generalizes across diverse modalities and domain shift scenarios.

Source-Free Domain Adaptation (SFDA) is a paradigm in domain adaptation that aims to address the domain shift problem under the constraint that the user may only access a source-trained model, but not the underlying source data, when adapting to a target domain. SFDA has gained prominence due to increasing privacy, legal, and data-ownership restrictions that preclude researchers or practitioners from sharing data across institutional boundaries, while the distributional mismatch between source and target remains a critical barrier to reliable performance in real-world deployments (Yu et al., 2023, Wang et al., 2024).

1. Formal Problem Setting and Foundational Principles

SFDA considers the scenario where a model $f_S$, pre-trained on a labeled source dataset $D_S = \{(x_i^S, y_i^S)\}_{i=1}^{n_S}$, must be adapted to an unlabeled target domain $D_T = \{x_j^T\}_{j=1}^{n_T}$ under the restriction that the original source data $D_S$ cannot be revisited. The available information comprises the full weights of the source-trained model and the unlabeled target samples (Yu et al., 2023).

The adaptation objective is to learn an updated model $f_T(\cdot\,; \theta_T)$ that minimizes the target risk,

$$\epsilon_{P_T}(f_T) = \mathbb{E}_{x \sim P_T}\left[\mathbf{1}\{f_T(x) \ne y^*(x)\}\right],$$

using only the accessible source model parameters ($\theta_S$) and the unlabeled target data ($D_T$). The challenge is compounded by the fact that source and target data are never jointly accessible, a critical differentiator from traditional domain adaptation (Yu et al., 2023, Wang et al., 2024).

The SFDA formulation encompasses multiple domain-shift scenarios:

  • Closed-set: Source and target share the same label space.
  • Open-set: Target contains additional, unknown classes.
  • Partial-set: Only a subset of source classes appears in target.
  • Generalized/Universal SFDA: Both label shift and covariate shift may exist, with no prior knowledge of class overlap (Tang et al., 2024).

2. Algorithmic Frameworks and Methodological Taxonomy

SFDA methods can be broadly categorized according to their modeling strategy (Yu et al., 2023, Yang et al., 2023, Yang et al., 2022, Zhang et al., 2022):

  1. Self-Training and Pseudo-labeling: Models generate pseudo-labels for target data—typically via nearest-neighbor assignment to frozen classifier weights or cluster centroids—and update the feature extractor by enforcing consistency between predicted labels and pseudo-labels. This family includes methods such as SHOT, which combines pseudo-label self-training with information maximization to encourage confident and diverse predictions (Yu et al., 2023, Yang et al., 2023).
  2. Neighborhood Consistency/Clustering: These approaches (e.g., NRC, AaD, DaC) exploit the manifold structure of the target domain by enforcing prediction consistency among proximate (usually mutual nearest-neighbor) samples in feature space. Reciprocal neighbors are assigned higher affinity, since empirical analysis shows that reciprocal nearest neighbors are more likely to share the true class than standard k-nearest neighbors (Yang et al., 2023, Yang et al., 2022, Zhang et al., 2022).
  3. Contrastive and Spectral Methods: Recent methods (e.g., SF(DA)$^2$, DaC) exploit graph-theoretic relationships by constructing augmentation graphs or clustering predictions via spectral methods, effectively replacing explicit data augmentation with geometry-driven regularization in feature space (Hwang et al., 2024, Zhang et al., 2022).
  4. Model Uncertainty and Regularization: Some frameworks introduce Bayesian modeling or Laplace approximations on model weights to quantify and propagate prediction uncertainty during adaptation, improving robustness in the presence of false pseudo-labels or OOD samples (Roy et al., 2022).
  5. Attention and Self-Distillation: Architectures like ARFNet incorporate residual attention fusion and channel- or spatial-wise attention mechanisms within deep networks, further combined with dynamic centroid evaluation and self-distillation for robust cluster formation in the target domain (Shao et al., 2025).
  6. Multimodal/Foundation Model Distillation: Recent advances leverage vision-language models such as CLIP to extract external causal factors or to serve as frozen multimodal teachers. Alignment is refined through mutual information maximization and subsequent knowledge distillation to the target-adapted model (Tang et al., 2024, Tang et al., 2023, Zhang et al., 2024, Zhang et al., 2022).
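The pseudo-labeling step common to family (1) can be made concrete. Below is a minimal NumPy sketch of SHOT-style centroid pseudo-labeling; the function name and the single-pass (rather than iterated) centroid computation are simplifications of ours, not the published implementation:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def centroid_pseudo_labels(features, logits):
    """SHOT-style pseudo-labels: build prediction-weighted class centroids
    from the frozen source classifier's soft outputs, then relabel each
    target sample by its nearest centroid under cosine similarity."""
    p = softmax(logits)                                   # (n, C) soft predictions
    # Weighted centroids: c_k = sum_i p_{i,k} f_i / sum_i p_{i,k}
    centroids = (p.T @ features) / (p.sum(axis=0, keepdims=True).T + 1e-8)
    # Cosine similarity between normalized features and centroids
    f = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-8)
    c = centroids / (np.linalg.norm(centroids, axis=1, keepdims=True) + 1e-8)
    return (f @ c.T).argmax(axis=1)                       # (n,) hard pseudo-labels
```

In SHOT this assignment is typically repeated once, with centroids recomputed from the resulting hard labels, before the pseudo-labels supervise a cross-entropy term on the feature extractor.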

A summary of prominent methods, representative papers, and their technical focus is presented below:

| Method/Framework | Core Principle | Reference |
|---|---|---|
| SHOT | Pseudo-labeling + information maximization | (Yu et al., 2023) |
| NRC | Reciprocal-neighbor clustering | (Yang et al., 2023) |
| AaD | Prediction consistency (attraction/dispersion) | (Yang et al., 2022) |
| DaC | Divide-and-contrast adaptive contrastive learning | (Zhang et al., 2022) |
| SF(DA)$^2$ | Spectral methods on an augmentation graph | (Hwang et al., 2024) |
| U-SFAN | Bayesian uncertainty | (Roy et al., 2022) |
| ARFNet | Attention, residual fusion | (Shao et al., 2025) |
| DIFO, LCFD | Foundation model distillation | (Tang et al., 2024, Tang et al., 2023) |
| Co-learn | Leveraging pre-trained features | (Zhang et al., 2022, Zhang et al., 2024) |

3. Objective Functions and Optimization Strategies

SFDA methods are united by the inability to use source data for explicit cross-domain alignment. Consequently, the objective functions incorporate the following regularizers and loss terms (Yu et al., 2023, Yang et al., 2023, Yang et al., 2022, Zhang et al., 2022, Tang et al., 2024):

  • Information Maximization (IM): Encourages predictions with low entropy (confidence) and high batch-level entropy (diversity), as in

$$\mathcal{L}_{IM} = -\frac{1}{n_T} \sum_{i=1}^{n_T} \sum_{c=1}^{C} p_{i,c} \log p_{i,c} + \sum_{c=1}^{C} \bar{p}_c \log \bar{p}_c,$$

where $\bar{p}_c$ is the mean predicted probability of class $c$ over the batch (Yu et al., 2023, Yang et al., 2023).
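For concreteness, $\mathcal{L}_{IM}$ can be computed directly from a batch of softmax outputs; the following is an illustrative NumPy sketch rather than any paper's reference implementation:

```python
import numpy as np

def information_maximization_loss(p, eps=1e-8):
    """L_IM for a batch of softmax outputs p of shape (n, C):
    mean per-sample entropy (confidence term) minus the entropy
    of the batch-mean prediction (diversity term)."""
    conf = -(p * np.log(p + eps)).sum(axis=1).mean()  # low when confident
    p_bar = p.mean(axis=0)                            # batch marginal
    div = -(p_bar * np.log(p_bar + eps)).sum()        # high when diverse
    return conf - div
```

Minimizing this value drives each prediction toward a one-hot vector while keeping the batch marginal close to uniform, which prevents collapse onto a single class.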

  • Cluster Consistency / Reciprocal Neighborhood Loss:

Methods construct a k-nearest-neighbor or reciprocal-nearest-neighbor graph on the target features and minimize

$$\mathcal{L}_{\mathcal{N}} = -\frac{1}{n_T} \sum_{i=1}^{n_T} \sum_{j \in \mathcal{R}_K(i)} \langle p_i, p_j \rangle,$$

where $\mathcal{R}_K(i)$ is the set of reciprocal neighbors of sample $i$ (Yang et al., 2023, Zhang et al., 2022, Yang et al., 2022).
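A minimal sketch of this loss, assuming cosine-similarity kNN search with mutual-neighbor filtering (the helper below is illustrative, not code from the cited methods):

```python
import numpy as np

def reciprocal_neighbor_loss(p, features, k=1):
    """Neighborhood-consistency loss: negative mean inner product between a
    sample's softmax prediction and those of its reciprocal k-nearest
    neighbors in feature space."""
    # Cosine-similarity kNN graph on the target features
    f = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-8)
    sim = f @ f.T
    np.fill_diagonal(sim, -np.inf)              # exclude self-matches
    knn = np.argsort(-sim, axis=1)[:, :k]       # indices of k nearest neighbors
    loss, n = 0.0, len(p)
    for i in range(n):
        for j in knn[i]:
            if i in knn[j]:                     # keep reciprocal pairs only
                loss -= p[i] @ p[j]
    return loss / n
```

Methods like NRC additionally weight non-reciprocal neighbors with a reduced affinity rather than discarding them outright.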

  • Contrastive/Spectral Loss:

Spectral graph approaches replace pseudo-label computation with clustering-friendly objectives that encourage high intra-cluster similarity and repulsion between non-neighboring samples (Hwang et al., 2024).
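A simplified attraction/dispersion objective in this spirit (closer to AaD than to a full spectral formulation; the function and its `lam` trade-off are illustrative) can be written as:

```python
import numpy as np

def attract_disperse_loss(p, neighbors, lam=1.0):
    """Attraction-dispersion objective: pull each prediction toward its
    neighbors' predictions and push it away from the rest of the batch.
    `neighbors[i]` is the list of neighbor indices of sample i."""
    n = len(p)
    loss = 0.0
    for i in range(n):
        others = [j for j in range(n) if j != i and j not in neighbors[i]]
        attract = sum(p[i] @ p[j] for j in neighbors[i])   # intra-cluster pull
        disperse = sum(p[i] @ p[j] for j in others)        # non-neighbor push
        loss += -attract + lam * disperse
    return loss / n
```

The dispersion term is what prevents the trivial solution in which all target predictions collapse onto one class.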

  • Mixup and Intermediate Region Regularization:

Some methods (e.g., CNG-SFDA) introduce mixup of features between clean and noisy (uncertain) regions to increase cluster compactness and reduce the impact of label noise (Cho et al., 2024).
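A generic clean/noisy mixup step can be sketched as follows, assuming a Beta-distributed mixing coefficient biased toward the clean sample (the exact CNG-SFDA formulation differs in detail, and all names here are illustrative):

```python
import numpy as np

def region_mixup(x_clean, y_clean, x_noisy, y_noisy, alpha=0.75, rng=None):
    """Mix a confident ('clean') sample with an uncertain ('noisy') one to
    smooth the decision region between them. alpha parameterizes the
    Beta(alpha, alpha) draw for the mixing coefficient."""
    rng = rng or np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)
    lam = max(lam, 1.0 - lam)        # keep the mix dominated by the clean sample
    x = lam * x_clean + (1.0 - lam) * x_noisy
    y = lam * y_clean + (1.0 - lam) * y_noisy
    return x, y, lam
```

The mixed pair `(x, y)` then enters the usual pseudo-label objective, diluting the influence of potentially mislabeled noisy samples.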

  • Uncertainty-weighted Losses:

Bayesian extensions attach confidence scores (e.g., negative entropy or Laplace-approximation uncertainty) to downweight unreliable target samples within information-maximization or pseudo-labeling objectives (Roy et al., 2022).
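One simple instantiation of such weighting uses normalized negative entropy as a per-sample confidence score; this sketch is illustrative and is not the Laplace-approximation machinery of U-SFAN:

```python
import numpy as np

def entropy_weights(p, eps=1e-8):
    """Per-sample confidence weights from normalized negative entropy:
    confident predictions get weight near 1, near-uniform ones near 0."""
    C = p.shape[1]
    ent = -(p * np.log(p + eps)).sum(axis=1) / np.log(C)   # normalized to [0, 1]
    return 1.0 - ent

def weighted_pseudo_label_loss(p, pseudo, eps=1e-8):
    """Cross-entropy to pseudo-labels, downweighted per sample by uncertainty."""
    w = entropy_weights(p)
    ce = -np.log(p[np.arange(len(p)), pseudo] + eps)
    return (w * ce).mean()
```

Samples whose predictions are nearly uniform thus contribute almost nothing, limiting the damage from false pseudo-labels.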

  • Mutual Information and Causal Factor Discovery:

In unified or multimodal settings, mutual information between target model outputs and large-scale foundation model predictions (e.g. CLIP) is maximized, with further losses for causal disentanglement or predictive consistency (Tang et al., 2024, Tang et al., 2023).

  • BatchNorm-Statistics Matching:

Model-centric adaptation that aligns the encoder's batch-normalization statistics with the source statistics stored in the model is an alternative when explicit labels are unavailable (Ishii et al., 2021, Hou et al., 2020).
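A minimal sketch of statistics matching, assuming the source model exposes its stored BatchNorm running mean and variance (the squared-distance form below is one simple choice; the cited works use related divergences):

```python
import numpy as np

def bn_stat_alignment_loss(target_feats, source_mean, source_var, eps=1e-5):
    """Align target batch statistics with stored source BatchNorm statistics:
    squared distance between the batch mean/std of the target features and
    the running mean/std saved in the source model's BN layers."""
    mu = target_feats.mean(axis=0)
    var = target_feats.var(axis=0)
    mean_term = ((mu - source_mean) ** 2).sum()
    std_term = ((np.sqrt(var + eps) - np.sqrt(source_var + eps)) ** 2).sum()
    return mean_term + std_term
```

Because only the first two moments of the source features are needed, this family of methods requires neither source data nor target labels.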

4. Empirical Results, Impact, and Comparative Evaluation

SFDA consistently demonstrates robust and often superior performance relative to standard Unsupervised Domain Adaptation (UDA) approaches, particularly in regimes with significant source-target shift or when data privacy concerns preclude source data usage (Wang et al., 2024, Yu et al., 2023, Yang et al., 2023).

Salient empirical findings include:

  • SFDA vs. UDA: Comprehensive experiments reveal that SFDA generally matches or outperforms UDA, especially when negative transfer risks are high or storage constraints are tight. For example, SFDA methods typically converge in ∼200 iterations versus 1,000–5,000 for UDA, and their target-focused objectives help avoid performance degradation caused by outlier source data (Wang et al., 2024).
  • State-of-the-Art Gains: Leading SFDA frameworks (e.g., NRC, SF(DA)$^2$, LCFD, DIFO, Co-learn) routinely deliver top-tier performance on benchmarks such as Office-31, Office-Home, VisDA, and DomainNet (Tang et al., 2024, Yang et al., 2023, Hwang et al., 2024, Zhang et al., 2022).
  • Generalization across Modalities: Some methods, such as NOTELA, have demonstrated effectiveness not only on computer vision tasks but also in bioacoustics, indicating cross-modal generalizability (Boudiaf et al., 2023).
  • Open-/Partial-/Generalized SFDA: Unified frameworks now address open-set and partial-set scenarios; for example, LCFD achieves a >6% average gain over the best prior method on open-/partial-set Office-Home and significant improvements on source-free OOD generalization tasks (Tang et al., 2024).
  • Multi-source and Data/Model Fusion: New multi-source SFDA and hybrid data-model fusion strategies (e.g., MEA weighting) further extend applicability to collaborative or federated contexts, outperforming classic multi-source UDA by a wide margin (Wang et al., 2024).

5. Challenges and Limitations

While SFDA has made substantial empirical progress, the field acknowledges a set of persistent and open technical issues (Yu et al., 2023, Wang et al., 2024, Boudiaf et al., 2023):

  • Pseudo-label Noise and Confirmation Bias: Pseudo-labeling strategies are susceptible to noise amplification, where early mistakes are reinforced by self-training. Approaches that partition clean and noisy regions or propagate confidence weights aim to mitigate this, yet robust statistical guarantees remain limited (Cho et al., 2024, Zhang et al., 2022, Roy et al., 2022).
  • Evaluation on Real-World Shifts: Many SFDA methods are tuned and tested on well-curated, relatively balanced datasets. Performance is less predictable on highly imbalanced, multi-label, or out-of-distribution settings, as observed in cross-modality studies (Boudiaf et al., 2023).
  • Hyperparameter Sensitivity: Dependence on confidence thresholds, neighbor counts, or loss weightings can require careful calibration. Autonomous or theory-driven parameter selection is an active direction (Zhang et al., 2022, Yang et al., 2022, Yang et al., 2023).
  • Limited Theoretical Analysis: Few generalization guarantees are available for generic SFDA objectives given the absence of source data. Progress is being made in the direction of causal factor analysis and formal links to mutual information or clustering theory (Tang et al., 2024).
  • Domain Gap Estimation and Outlier Detection: Without source data, estimating domain discrepancy or identifying class distribution mismatch is challenging. Work on uncertainty quantification and dynamic region selection addresses this issue in part (Roy et al., 2022, Cho et al., 2024).
  • Memory and Computation: Methods based on explicit memory banks or feature graphs incur $O(n_T)$ cost and require efficient search techniques for large-scale or streaming domains (Zhang et al., 2022, Yang et al., 2023, Hwang et al., 2024).

6. Emerging Directions and Prospects

Research on SFDA has expanded in both methodological depth and domain coverage:

  • Unified and Generalized SFDA: Formulations capable of simultaneously addressing closed-set, open-set, partial-set, and continual/online adaptation with anti-forgetting regularization are now available (e.g., LCFD), often leveraging foundation models and causal inference (Tang et al., 2024, Tang et al., 2023).
  • Integration of Multimodal Knowledge: Distillation from frozen vision-language models to target-domain models via mutual information and category-encouragement regularization provides substantial boosts, particularly in settings with large semantic label spaces or highly imbalanced label distributions (Tang et al., 2023, Zhang et al., 2024).
  • Data-Model Fusion and Federated SFDA: In real-world collaborative contexts, merging adaptation signals from both raw datasets and source models proves superior to classic UDA or SFDA in isolation. Principled weighting and routing strategies, such as MEA, are crucial for optimal performance (Wang et al., 2024).
  • Beyond Image Classification: SFDA is now being applied to object detection (e.g., SFDLA for document layout analysis), semantic segmentation, video analytics, and audio understanding, with architectural adaptations for dual-teacher distillation, consensus pseudo-labeling, and attention fusion (Tewes et al., 2025, Shao et al., 2025, Boudiaf et al., 2023).
  • Theoretical Developments: Increasing focus on deriving risk bounds, causal justifications for adaptation, and stability analyses under limited domain knowledge can be observed in recent work (Tang et al., 2024, Yu et al., 2023).

A plausible implication is that SFDA is emerging as a foundational paradigm for model deployment in privacy-restricted, dynamic, and multi-source/multi-domain real-world environments, with ongoing research extending its reach and robustness.

