
SF-UniDA: Source-Free Universal Domain Adaptation

Updated 23 January 2026
  • SF-UniDA is a framework that adapts models to new domains by accurately classifying shared (known) classes while robustly rejecting target-private (unknown) categories.
  • It employs a memory-efficient, online GMM-based pseudo-labeling strategy with entropy and likelihood tests to ensure high-precision adaptation in streaming scenarios.
  • Empirical results demonstrate that SF-UniDA outperforms previous methods by up to 4 H-score points on benchmarks while using constant memory regardless of stream length.

Source-Free Universal Domain Adaptation (SF-UniDA) addresses the adaptation of machine learning models to target domains that exhibit both covariate (domain) and arbitrary category (label set) shifts—without any access to source data during deployment. In SF-UniDA, the target label set may be a subset, superset, or partially overlap with the source label set, meaning some source classes may be absent, novel classes may appear, or both conditions may hold. The scenario is further complicated by real-world constraints such as online data streams (where batches arrive sequentially and cannot be revisited) and limited memory resources. Recent research has operationalized SF-UniDA as a test-time or continual adaptation problem, seeking both precise recognition of “known” (source-overlapping) classes and the robust rejection of “unknown” (target-private) categories, all in a strictly source-free and efficient manner (Schlachter et al., 2024, Schlachter et al., 16 Apr 2025, Schlachter et al., 2024, Schlachter et al., 16 Jan 2026).

1. Formal Definition and Problem Setting

Given a labeled source dataset $\mathcal{X}_s = \{(x_i^s, y_i^s)\}_{i=1}^{N_s}$ with class set $C_s$ ($|C_s| = K_\mathrm{known}$), and a stream of unlabeled target samples $\mathcal{X}_t = \{x_j^t\}_{j=1}^T$ arriving in mini-batches $\{B_t\}$, the true target label set $C_t$ is unknown and may include:

  • $C_\mathrm{known} = C_s$ (shared classes), and
  • $C_\mathrm{unknown} = C_t \setminus C_s$ (novel/unknown classes).

The adaptation task is to modify a source-trained feature extractor $f_\theta$ and classifier $g_\phi$ such that, without accessing any source data, target samples from $C_s$ are classified accurately and those from $C_\mathrm{unknown}$ are flagged (e.g., mapped to a special “unknown” label). In online SF-UniDA, adaptation occurs in a streaming, one-pass regime, which is critical for embedded and real-time systems (Schlachter et al., 2024, Schlachter et al., 2024, Schlachter et al., 16 Jan 2026).

Key challenges:

  • Strict source-free constraint: no access to raw source samples, prototypes, or replay buffers at deployment.
  • Universal label space: arbitrary overlap or disjointness between source and target class sets.
  • Resource efficiency: memory and computation must scale at most with the number of source classes and feature dimension, not dataset size or stream length.

2. Core Methodologies: GMM-based Pseudo-Labeling and Memory Efficiency

A dominant approach for online SF-UniDA is memory-efficient pseudo-labeling using a Gaussian Mixture Model (GMM) to model the distribution of source-known class features in embedding space (Schlachter et al., 2024). Each incoming batch $B_t$ is processed as follows:

  • Feature extraction: Compute $z_i = f_\theta(x_i)$ for $x_i \in B_t$.
  • GMM soft-assignment (E-step): For $K_\mathrm{known}$ mixture components, assign responsibilities $r_{ik} = p(k \mid z_i)$ based on the current means $\mu_k$, covariances $\Sigma_k$, and weights $\pi_k$.
  • Online EM M-step: Update mixture parameters with moving averages,

$$N_k^{(t)} = (1-\alpha)N_k^{(t-1)} + \alpha \sum_{i=1}^m r_{ik},$$

$$\mu_k^{(t)} = (1-\beta)\mu_k^{(t-1)} + \beta \frac{1}{N_k^{(t)}}\sum_{i=1}^m r_{ik} z_i,$$

and similarly for $\Sigma_k$ and $\pi_k$, where $\alpha, \beta$ control the forgetting rate.

  • This methodology requires storage for only $O(K_\mathrm{known} \cdot d^2)$ floats (class means, covariances, weights), achieving constant space over long streams, unlike buffer- or prototype-based methods.
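The online E- and M-steps above can be sketched in NumPy. This is a minimal illustration under the assumption of diagonal covariances; the function name and parameter defaults are our own, not taken from the papers:

```python
import numpy as np

def online_em_step(Z, mu, var, pi, N, alpha=0.1, beta=0.1):
    """One online EM update on a feature batch Z of shape (m, d).

    mu: (K, d) component means, var: (K, d) diagonal covariances,
    pi: (K,) mixture weights, N: (K,) running responsibility mass.
    alpha and beta are the forgetting rates from the moving-average updates.
    """
    m, d = Z.shape
    K = mu.shape[0]
    # E-step: responsibilities r_ik ∝ pi_k * N(z_i | mu_k, diag(var_k))
    log_p = np.empty((m, K))
    for k in range(K):
        diff = Z - mu[k]
        log_p[:, k] = (np.log(pi[k])
                       - 0.5 * np.sum(np.log(2 * np.pi * var[k]))
                       - 0.5 * np.sum(diff**2 / var[k], axis=1))
    log_p -= log_p.max(axis=1, keepdims=True)   # numerical stability
    r = np.exp(log_p)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: exponential moving averages, as in the equations above
    N_new = (1 - alpha) * N + alpha * r.sum(axis=0)
    mu_new = np.empty_like(mu)
    var_new = np.empty_like(var)
    for k in range(K):
        mu_new[k] = (1 - beta) * mu[k] + beta * (r[:, k] @ Z) / N_new[k]
        diff = Z - mu_new[k]
        var_new[k] = (1 - beta) * var[k] + beta * (r[:, k] @ diff**2) / N_new[k]
    pi_new = N_new / N_new.sum()
    return mu_new, var_new, pi_new, N_new
```

The state carried between batches is just `(mu, var, pi, N)`, i.e. a few arrays whose size depends only on $K_\mathrm{known}$ and $d$, which is what gives the constant-memory property.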

Pseudo-labels for adaptation are generated via a two-stage criterion:

  • Entropy test: Accept only those target samples whose classifier softmax entropy satisfies $H(\hat{p}_i) \leq \tau_H$.
  • GMM likelihood test: Accept only those with $p(z_i) \geq \tau_L$. Samples failing either test are tagged as “unknown” and excluded from supervised adaptation (Schlachter et al., 2024).
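The two-stage acceptance rule can be written in a few lines of NumPy. This is a hedged sketch: the function name is hypothetical and the threshold values are illustrative, not tuned values from the papers:

```python
import numpy as np

def pseudo_label_filter(probs, gmm_loglik, tau_H=0.5, tau_L=-50.0):
    """Two-stage pseudo-label acceptance.

    probs: (m, K) classifier softmax outputs.
    gmm_loglik: (m,) GMM log-likelihoods of the features.
    Returns pseudo-labels (class index, or -1 for "unknown") and an accept mask.
    """
    # Stage 1: entropy test on the classifier prediction
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    # Stage 2: likelihood test under the online GMM (log domain)
    accept = (entropy <= tau_H) & (gmm_loglik >= tau_L)
    labels = np.where(accept, probs.argmax(axis=1), -1)  # -1 marks "unknown"
    return labels, accept
```

Only samples passing both tests contribute to the supervised adaptation loss; the rest are routed to the “unknown” label.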

This high-precision pseudo-labeling is critical for online adaptation, as shown in controlled analyses quantifying the substantial gap between the performance of current state-of-the-art SF-UniDA (e.g., GMM-based) and the theoretical upper bound achieved with perfect pseudo-labels (Schlachter et al., 16 Apr 2025).

3. Adaptation Losses, Objective Functions, and Robustness to Label Noise

Accepted pseudo-labeled samples are used to compute adaptation gradients. The principal adaptation loss in memory-efficient SF-UniDA is a batch-level supervised contrastive loss:

$$L_\mathrm{contrastive} = -\frac{1}{|A_t|}\sum_{i\in A_t} \frac{1}{|S_+(i)|}\sum_{j\in S_+(i)} \log\frac{\exp(\mathrm{sim}(z_i, z_j)/\tau)}{\sum_{k\in A_t\setminus\{i\}} \exp(\mathrm{sim}(z_i, z_k)/\tau)}$$

where $A_t$ denotes the set of accepted (pseudo-labeled) samples in batch $t$, $S_+(i)$ indexes the other batch features sharing sample $i$'s pseudo-label, and $\mathrm{sim}(\cdot, \cdot)$ denotes cosine similarity.

A KL-divergence term softly aligns the model's predicted distribution with a uniform prior over the $K_\mathrm{known}$ classes:

$$L_\mathrm{KL} = \mathrm{KL}(u \,\|\, p_t), \quad u(c) = 1/K_\mathrm{known}, \quad p_t(c) = \frac{1}{m}\sum_{i=1}^m \hat{p}_i[c].$$

The total loss per batch is $L_\mathrm{total} = L_\mathrm{contrastive} + \lambda L_\mathrm{KL}$, with $\lambda$ tuned on a source-free validation set (Schlachter et al., 2024). This combination is shown to be robust to noisy pseudo-labels; contrastive losses can drive adaptation even at moderate pseudo-label accuracy (30–40%), while cross-entropy losses are brittle and only outperform when label quality approaches perfection (Schlachter et al., 16 Apr 2025).
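As a rough illustration, both loss terms can be computed directly in NumPy. This is a minimal sketch, not the papers' implementation; `supcon_loss` and `kl_to_uniform` are hypothetical helper names, and the temperature value is illustrative:

```python
import numpy as np

def supcon_loss(Z, labels, tau=0.1):
    """Batch supervised contrastive loss over accepted features Z (n, d)
    with integer pseudo-labels, using cosine similarity as in the formula."""
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    sim = (Zn @ Zn.T) / tau
    n = len(labels)
    losses = []
    for i in range(n):
        pos = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not pos:
            continue  # no positives: sample contributes nothing
        others = [k for k in range(n) if k != i]
        log_denom = np.log(np.sum(np.exp(sim[i, others])))
        losses.append(-np.mean([sim[i, j] - log_denom for j in pos]))
    return float(np.mean(losses))

def kl_to_uniform(probs):
    """KL(u || p_t) where p_t is the mean predicted distribution (m, K)."""
    K = probs.shape[1]
    p_t = probs.mean(axis=0)
    u = np.full(K, 1.0 / K)
    return float(np.sum(u * np.log(u / (p_t + 1e-12))))
```

The per-batch objective is then `supcon_loss(Z, labels) + lam * kl_to_uniform(probs)` for a chosen weight `lam`.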

Empirical analyses demonstrate that quality of pseudo-labels is markedly more important than quantity: high-confidence pseudo-labeling with rejection of ambiguous samples produces superior adaptation—even at the cost of smaller, cleaner batches (Schlachter et al., 16 Apr 2025).

4. Benchmarks, Empirical Results, and Memory Analysis

Standard evaluation uses large-scale benchmarks (DomainNet: 345 classes; Office-Home: 65 classes; VisDA-C: 12 synthetic→real classes) and a universal split: 50% of target classes are unseen (unknown) during adaptation (Schlachter et al., 2024).

Results (average H-score, harmonic mean of known accuracy and unknown F1):

Dataset       SF-UAN   OSFDA   GMM (Ours)
DomainNet      41.2     43.7     46.5
Office-Home    53.8     55.4     58.1
VisDA-C        65.1     66.8     69.3

The GMM-based method outperforms previous online SF-UniDA approaches by 2–4 H-score points while using roughly 3× less memory. Storage is $O(K_\mathrm{known} \cdot d^2)$ (a diagonal covariance is often used; $d$ is the embedding dimension), compared to buffer approaches requiring $O(N_s d)$ or $O(K_\mathrm{known} m d)$. For $d \approx 256$ and $K_\mathrm{known} \leq 50$, only $\sim 10^4$ floats are needed (Schlachter et al., 2024).
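The footprint claim can be checked with a quick back-of-the-envelope calculation, assuming diagonal covariances; the buffer size used for comparison is an illustrative value, not a number from the paper:

```python
K, d = 50, 256                         # known classes, embedding dimension
# GMM state with diagonal covariances: means + variances + weights
gmm_floats = K * d + K * d + K         # = 25,650 ≈ 10^4, constant in stream length
# A feature buffer of, say, 10,000 stored embeddings would instead need:
buffer_floats = 10_000 * d             # = 2,560,000, and it grows with what is kept
```

A full covariance per class would instead cost $K d^2 \approx 3.3 \times 10^6$ floats here, which is why the diagonal approximation matters in practice.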

5. Comparison with Other Paradigms and Relations to Broader SF-UniDA Literature

The GMM-based pseudo-labeling paradigm is contrasted with:

  • Clustering-based methods (e.g., GLC, GLC++): which apply global one-vs-all and local kNN clustering, with extensions to latent structure discovery in unknowns via contrastive affinity learning (Qu et al., 2024). These aim to explicitly cluster the “unknown” class but often have higher memory/computation cost.
  • Orthogonal feature decomposition (LEAD): decouples features into source-known and -unknown subspaces, with instance-adaptive boundaries for unknown detection via GMM on projection norms (Qu et al., 2024).
  • Mean Teacher frameworks (COMET, GMM-COMET): leverage online pseudo-labeling with teacher-student consistency, contrastive clustering, and additional entropy or source-consistency regularization for stability in highly dynamic or continual streams (Schlachter et al., 2024, Schlachter et al., 16 Jan 2026).
  • Vision-language based/self-training methods (COCA, CausalDA): utilize frozen CLIP backbones and classifier calibration via prototypes or causal prompt tuning. These methods focus on classifier adaptation with language priors and are particularly effective in few-shot or generalized scenarios (Liu et al., 2023, Tang et al., 2024).

SF-UniDA is systematically reviewed in surveys that categorize model-centric and data-centric approaches and elucidate the challenges of robust unknown detection, adaptive thresholding, and scalable clustering without source data or labels (Yu et al., 2023).

6. Limitations, Open Problems, and Future Directions

Despite recent progress, critical challenges remain:

  • Hyperparameter sensitivity: Entropy and likelihood thresholds for pseudo-label acceptance ($\tau_H$, $\tau_L$) require tuning, often on small source-free validation sets. Automatic or self-supervised threshold selection remains an open problem (Schlachter et al., 2024).
  • Unknown class modeling: Current GMM-based methods cannot discover or cluster target-unknown classes, as the number of mixture components is fixed to $K_\mathrm{known}$ (Schlachter et al., 2024, Schlachter et al., 2024).
  • Low-data regimes: Per-class covariance or mixture estimation may be unstable for small batches (Schlachter et al., 2024).
  • Continual label shifts: Most frameworks assume a fixed number of known/unknown classes throughout the stream; adaptation to dynamic, evolving label spaces is underexplored (Schlachter et al., 16 Jan 2026).
  • Extension beyond classification: Applications to structured outputs (segmentation, detection), or multimodal domains, are open research directions (Yu et al., 2023).

Potential solutions discussed in the literature include nonparametric mixture modeling, adaptive thresholding by self-supervised or auxiliary validation, and more expressive per-class feature models (full-rank covariances, class-conditional attention) (Schlachter et al., 2024, Schlachter et al., 16 Jan 2026). Extensions to scalable continual learning, dynamic category discovery, and deeper integration of language priors are identified as promising future directions.

7. Significance and Impact

SF-UniDA—particularly with efficient GMM-based online pseudo-labeling—has set new memory and accuracy benchmarks for universal adaptation in privacy-sensitive, streaming, or embedded environments (Schlachter et al., 2024). Empirical and theoretical analyses establish the centrality of high-precision pseudo-labeling, robustness to moderate noise through contrastive objectives, and the scalability of constant-memory, purely online updates (Schlachter et al., 2024, Schlachter et al., 16 Apr 2025, Schlachter et al., 2024, Schlachter et al., 16 Jan 2026). The paradigm provides a practical foundation for robust, low-resource deployment of adaptive models under real-world nonstationarity and open-world semantic drift.
