Instance-aware Pseudo-label Selection
- Instance-aware pseudo-label selection is a technique that assigns pseudo-labels based on instance-specific criteria such as confidence scores, loss statistics, and geometric features.
- It leverages optimization formulations and instance-level thresholding to refine label assignments, resulting in enhanced accuracy and robustness in weakly supervised scenarios.
- Empirical studies demonstrate that these methods improve performance in tasks like partial label learning, segmentation, and domain adaptation by reducing noise and handling ambiguity.
Instance-aware pseudo-label (IPL) selection encompasses a class of algorithms and learning strategies in which pseudo-label assignment or filtering depends explicitly on the properties of individual instances, rather than global heuristics or uniform thresholds. IPL methods are critical for learning scenarios with weak, partial, or noisy supervision and have been formalized for a variety of problem settings including partial label learning, semi-supervised learning, multi-label refinement, and domain adaptation. IPL strategies are characterized by mathematical formulations and optimization procedures that particularize the pseudo-label assignment, selection, or update for each instance—typically leveraging confidence scores, loss statistics, geometric relationships, or probabilistic guarantees. This article surveys the core principles, representative methodologies, theoretical results, and empirical evidence underpinning instance-aware pseudo-label selection.
1. Mathematical and Optimization Frameworks for IPL
Instance-aware pseudo-label selection is most clearly distinguished by optimization problems formulated at the instance level. In partial label learning, the self-guided retraining (SURE) formulation introduces a confidence matrix $P$ whose $i$-th row $p_i$ is an instance-specific pseudo-label probability vector. The objective takes the form
$$\min_{\theta,\,P}\;\sum_{i=1}^{n}\Big[\ell\big(f(x_i;\theta),\,p_i\big)-\lambda\,\lVert p_i\rVert_\infty\Big],$$
subject to the per-instance constraints $\sum_j p_{ij}=1$, $p_{ij}\ge 0$, and $p_{ij}=0$ for all $j\notin S_i$, where $S_i$ is the candidate label set of instance $x_i$ (Feng et al., 2019). The $\ell_\infty$-norm regularization enforces that a single candidate per instance is favored, leading to an implicit disambiguation for each $x_i$.
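As a concrete illustration, the sketch below solves the per-instance subproblem for a squared-loss instantiation of this objective, with the favored candidate fixed to the one with the highest current model output (the surrogate discussed in Section 4). The function names and the closed-form-via-projection route are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css - 1.0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def sure_confidence_update(model_probs, candidate_mask, lam=0.5):
    """Per-instance confidence update for a squared-loss SURE-style objective.

    model_probs: (C,) current model outputs g(x_i) for one instance.
    candidate_mask: (C,) boolean mask of the candidate label set S_i.
    Solves  min_b ||b - g||^2 - lam * b_k  over the simplex restricted to S_i,
    with k fixed to the highest-scoring candidate (upper-bound surrogate).
    """
    cand = np.flatnonzero(candidate_mask)
    v = model_probs[cand].astype(float).copy()
    v[np.argmax(v)] += lam / 2.0      # absorb the -lam * b_k term into the target
    b = np.zeros(len(model_probs))
    b[cand] = project_to_simplex(v)   # closed form via simplex projection
    return b
```

Since Euclidean projection onto the simplex preserves the ordering of coordinates, the favored candidate retains the largest confidence, so the surrogate's max-attaining constraint holds implicitly.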
In iterative label cleaning for few-shot learning, instance-aware pseudo-labels are obtained by propagating labels across a graph constructed on the support and query set features, followed by a loss-based selection on per-query empirical consistency (Lazarou et al., 2020). Similarly, multi-label refinement approaches use an optimization over pseudo-label matrices indexed by instance, with assignments updated per instance based on local gradients of validation loss (Hsieh et al., 2021).
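A minimal sketch of the graph-propagation step, using the standard closed-form label-spreading solution on a k-nearest-neighbor affinity graph, is shown below; the kernel bandwidth heuristic and hyperparameters are assumptions, and the cited pipeline adds class balancing and loss-based cleaning on top of this step.

```python
import numpy as np

def propagate_labels(features, y_onehot, labeled_mask, k=10, alpha=0.8):
    """Closed-form label propagation over a k-NN affinity graph (sketch).

    features: (N, D) support + query embeddings.
    y_onehot: (N, C) one-hot labels, arbitrary rows for unlabeled instances.
    labeled_mask: (N,) boolean, True for labeled (support) rows.
    Returns (N, C) propagated class scores used as instance-wise pseudo-labels.
    """
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (np.median(d2) + 1e-12))       # Gaussian affinities
    np.fill_diagonal(W, 0.0)
    drop = np.argsort(-W, axis=1)[:, k:]            # keep only k nearest neighbors
    np.put_along_axis(W, drop, 0.0, axis=1)
    W = np.maximum(W, W.T)                          # symmetrize
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(1) + 1e-12)
    S = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    Y = np.where(labeled_mask[:, None], y_onehot, 0.0)
    return np.linalg.solve(np.eye(len(W)) - alpha * S, Y)   # (I - aS)^-1 Y
```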
Bayesian approaches such as BPLS frame selection as an instance-level decision problem: each candidate instance is scored by the (approximate) posterior predictive probability of its pseudo-sample given the observed data, and the highest-scoring instance is pseudo-labeled, enabling full exploitation of per-instance uncertainty and model fit (Rodemann et al., 2023, Rodemann, 2023).
2. Instance-level Confidence, Thresholding, and Ambiguity Modeling
Several IPL techniques assign or filter pseudo-labels based on per-instance confidence, ambiguity, or calibration statistics. InstanT develops instance-dependent thresholds: for each unlabeled instance $x$, a threshold $\tau(x)$ is computed in closed form from local posterior estimates and an instance-dependent label-transition matrix $T(x)$, where $T(x)$ encodes instance-dependent noise and the posterior derives from the current model's predictions (Li et al., 2023). In contrast with rigid, global thresholds, this construction enables IPL to control error rates at the instance level and provides probabilistic guarantees on correct assignment.
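The closed form is model-specific; the schematic below only conveys the general idea of raising the acceptance bar for instances whose estimated transition matrix indicates heavier label noise, and is not the rule derived in the cited work.

```python
import numpy as np

def instance_thresholds(probs, transition, tau_base=0.7):
    """Schematic instance-dependent thresholding (illustrative, not InstanT's rule).

    probs: (N, C) model posteriors for unlabeled instances.
    transition: (N, C, C) estimated instance-dependent transition matrices,
    where transition[i, y, y] approximates how reliably a pseudo-label y for
    instance i reflects the true label.
    Returns hard pseudo-labels and a boolean acceptance mask.
    """
    y_hat = probs.argmax(1)
    conf = probs.max(1)
    purity = transition[np.arange(len(probs)), y_hat, y_hat]
    tau = tau_base + (1.0 - purity) * (1.0 - tau_base)   # noisier => higher bar
    return y_hat, conf >= tau
```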
Decoupled dual-threshold filtering in semi-supervised instance segmentation independently thresholds the class confidence and the mask quality of each pseudo-labeled object at the instance level, ensuring both properties are met before a detected object is retained (Lin et al., 16 May 2025). Confidence separable learning (CSL) in semantic segmentation further combines per-instance maximum confidence and "residual dispersion" (the spread of non-maximum predicted probabilities) to define a two-dimensional feature space for convex instance-level selection (Liu et al., 20 Sep 2025).
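A hedged sketch of selection in such a two-dimensional (confidence, dispersion) space follows; the linear acceptance rule and its parameters stand in for the convex, learned boundary used by CSL.

```python
import numpy as np

def confidence_dispersion_select(probs, conf_min=0.6, disp_weight=2.0):
    """Select pseudo-labels using max confidence and residual dispersion.

    probs: (N, C) predicted class probabilities per instance (or per pixel).
    Residual dispersion is the spread of the non-maximum probabilities; the
    linear rule below accepts predictions that are confident and unambiguous.
    """
    sorted_probs = np.sort(probs, axis=1)
    conf = sorted_probs[:, -1]                 # maximum probability
    dispersion = sorted_probs[:, :-1].std(1)   # spread of the remaining mass
    accept = conf - disp_weight * dispersion >= conf_min
    return probs.argmax(1), accept
```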
3. Loss-based, Graph-based, and Contrastive Instance-aware Selection
IPL methods frequently rely on loss values or prediction distributions computed per instance to guide pseudo-label selection or refinement. In iterative label cleaning, a parametrically simple classifier is trained on both labeled and provisionally pseudo-labeled data; instances with the lowest average cross-entropy loss are preferentially selected, exploiting the empirical observation that cleanly labeled data exhibit lower loss (Lazarou et al., 2020).
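The selection step can be summarized as follows; the per-class quota is an assumption, reflecting the class-balancing concerns noted below.

```python
import numpy as np

def select_cleanest(logits, pseudo_labels, keep_per_class=5):
    """Keep, per class, the pseudo-labeled instances with the lowest loss.

    logits: (N, C) outputs of a simple classifier retrained on provisional
    pseudo-labels; pseudo_labels: (N,) provisional hard labels.
    Low cross-entropy loss is used as a proxy for label cleanliness.
    """
    shifted = logits - logits.max(1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(1, keepdims=True))
    losses = -log_probs[np.arange(len(pseudo_labels)), pseudo_labels]
    kept = []
    for c in np.unique(pseudo_labels):
        idx = np.flatnonzero(pseudo_labels == c)
        kept.extend(idx[np.argsort(losses[idx])[:keep_per_class]])
    return np.sort(np.asarray(kept))
```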
Manifold-based algorithms, such as those adopting label propagation over affinity graphs, perform balancing and normalization over the instance-level distributions, with subsequent cleaning based on class balance and per-instance loss statistics. Loss-based selection also appears in memory replay strategies for incremental partial label learning, where representative and diverse instances are identified as those closest to or most characteristic of class prototypes, computed by feature aggregation (Wang et al., 23 Jan 2025).
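A simplified prototype-based memory selection might look as follows; the near-prototype plus one-far-sample heuristic is illustrative rather than the cited method's exact representativeness and diversity criteria.

```python
import numpy as np

def prototype_memory_select(features, labels, n_per_class=10):
    """Pick exemplars per class for a replay memory via class prototypes.

    features: (N, D) embeddings; labels: (N,) pseudo or true labels.
    Representative samples lie closest to the class prototype (mean feature);
    one far sample per class is added as a crude notion of diversity.
    """
    memory = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        dist = np.linalg.norm(features[idx] - features[idx].mean(0), axis=1)
        order = idx[np.argsort(dist)]
        picked = np.unique(np.concatenate([order[: n_per_class - 1], order[-1:]]))
        memory.extend(picked.tolist())
    return np.asarray(memory)
```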
In multitask and multitarget settings (e.g., electron microscopy segmentation), detection outputs are used to guide instance selection: confident center points (from a detection head) define connected regions in the segmentation mask, yielding semantically grouped IPLs rather than pixelwise or class-specific heuristics (Xiong et al., 18 Oct 2025). Contrastive approaches construct query-prototype relationships per instance (or per pixel) and align instance-level features with class-specific prototypes to enforce discriminative and domain-invariant representations.
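The detection-guided grouping can be sketched as connected-component selection seeded by confident center points; the thresholds and the single-foreground-channel setting are assumptions for illustration.

```python
import numpy as np
from scipy import ndimage

def detection_guided_instances(seg_prob, centers, seg_thresh=0.5, center_conf=0.8):
    """Group segmentation pseudo-labels into instances via detected centers.

    seg_prob: (H, W) foreground probability from the segmentation head.
    centers: iterable of (row, col, confidence) from the detection head.
    A connected foreground region becomes an instance-level pseudo-label only
    if it contains at least one sufficiently confident center point.
    """
    foreground = seg_prob >= seg_thresh
    components, _ = ndimage.label(foreground)       # connected regions
    keep = set()
    for r, c, conf in centers:
        if conf >= center_conf and foreground[int(r), int(c)]:
            keep.add(components[int(r), int(c)])
    return np.where(np.isin(components, list(keep)), components, 0)
```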
4. Surrogates, Regularization, and Efficiency Considerations
Solving instance-aware selection problems exactly can be computationally costly. The SURE algorithm mitigates this by introducing an upper-bound surrogate objective, which restricts attention to the candidate label with the highest model output and thereby reduces the per-instance optimization from one quadratic program per candidate label to a single, efficiently solvable QP (Feng et al., 2019).
Regularization strategies often act at the instance level. In semantic segmentation, region-adaptive regularization distinguishes "confident" pseudo-labeled regions and "ignored" regions, applying smoothness (KL divergence from the uniform) or sharpening (entropy minimization) losses, respectively (Zhu et al., 2023). Mean teacher (EMA) consistency constraints are introduced on ignored (unreliable) regions to propagate reliable context and reduce label noise (Zhu et al., 2023, Liu et al., 20 Sep 2025).
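A minimal sketch of such region-adaptive regularization, following the assignment described above and assuming equal weighting of the two terms, is given below.

```python
import numpy as np

def region_adaptive_regularizer(probs, confident_mask, eps=1e-8):
    """Region-adaptive regularization sketch for segmentation pseudo-labels.

    probs: (N, C) per-pixel predicted probabilities (flattened spatially).
    confident_mask: (N,) boolean, True where the pseudo-label was retained.
    Confident regions receive a smoothness term (KL divergence from uniform),
    ignored regions a sharpening term (entropy minimization), as described above.
    """
    p = np.clip(probs, eps, 1.0)
    num_classes = p.shape[1]
    kl_to_uniform = (p * (np.log(p) + np.log(num_classes))).sum(1)
    entropy = -(p * np.log(p)).sum(1)
    smooth = kl_to_uniform[confident_mask].mean() if confident_mask.any() else 0.0
    sharpen = entropy[~confident_mask].mean() if (~confident_mask).any() else 0.0
    return smooth + sharpen
```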
Dynamic correction modules, as in the PL-DC framework, use visual-language alignment (e.g., CLIP) to revise noisy class assignments adaptively, with the fusion weight decaying over training. This ensures correction is instance- and stage-specific, aiding robustness under label ambiguity or class confusion (Lin et al., 16 May 2025).
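A sketch of the fusion step, assuming a linear decay schedule for the correction weight (the actual schedule and fusion rule in PL-DC may differ), is shown below.

```python
import numpy as np

def corrected_class_probs(model_probs, clip_probs, step, total_steps, w0=0.7):
    """Dynamically corrected class assignment with a decaying fusion weight.

    model_probs, clip_probs: (C,) class probabilities for one pseudo-labeled
    instance from the task model and from vision-language (CLIP) matching.
    The CLIP weight decays over training, so correction dominates early and
    the task model takes over later.
    """
    w = w0 * max(0.0, 1.0 - step / float(total_steps))
    fused = w * np.asarray(clip_probs) + (1.0 - w) * np.asarray(model_probs)
    return fused / fused.sum()
```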
5. Theoretical Guarantees and Robustness in Instance-aware Selection
IPL selection has been augmented with rigorous theoretical backing. Reduction-based pseudo-label generation proves that, for instance-dependent partial label learning, pseudo-labels constructed via aggregation from auxiliary model branches trained with label exclusions yield a higher probability of matching the Bayes-optimal classifier than direct model self-training (Qiao et al., 28 Oct 2024). Under multiclass Tsybakov noise conditions, lower bounds on the probability that the reduction-based pseudo-label agrees with the Bayes-optimal prediction are established, parameterized by the error rates of the main model and the auxiliary branches.
Instance-dependent threshold methods provide guarantees that pseudo-labels assigned above dynamically computed thresholds satisfy correctness at a rate lower-bounded by functionals of model error and class margin parameters (Li et al., 2023).
Bayesian instance-aware selection (BPLS) penalizes model overconfidence via the log-determinant of the Fisher information matrix, suppressing confirmation bias. This robustifies learner behavior in high-dimensional and overfitting-prone regimes and admits generalization to non-i.i.d. settings and multi-objective tradeoffs (Rodemann et al., 2023, Rodemann, 2023).
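A Laplace-style scoring sketch is given below; `refit_loglik` and `refit_fisher` in the usage comment are hypothetical helpers that refit the model with a candidate pseudo-sample included, not functions from the cited work.

```python
import numpy as np

def bpls_score(log_lik_at_fit, fisher_info):
    """Bayesian pseudo-label selection score (Laplace-style sketch).

    log_lik_at_fit: joint log-likelihood of observed plus pseudo-labeled data,
    evaluated at the fitted (MAP) parameters after adding a candidate
    pseudo-sample.
    fisher_info: (d, d) observed Fisher information matrix at that fit.
    The log-determinant term penalizes overconfident, sharply peaked fits.
    """
    _, logdet = np.linalg.slogdet(fisher_info)
    return log_lik_at_fit - 0.5 * logdet

# Usage sketch (hypothetical helpers): score every candidate, pseudo-label the best.
# best = max(candidates, key=lambda c: bpls_score(refit_loglik(c), refit_fisher(c)))
```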
6. Empirical Impact and Practical Applications
Empirical evaluations across tabular, image, text, and speech domains consistently show that IPL selection strategies outperform global or class-level heuristics, especially as label ambiguity, annotation noise, and class imbalance increase. On partial label benchmarks, the SURE approach maintains superior performance as the noise or number of ambiguous labels rises, attributed to instance-aware confidence optimization (Feng et al., 2019). In vision-language prompt tuning, candidate set selection via intra- and inter-instance confidence statistics improves true label inclusion rates and class balance relative to top-1 hard thresholding (Zhang et al., 15 Jun 2024).
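An illustrative construction of such candidate sets from intra- and inter-instance confidence statistics is sketched below; the specific rules and parameters are assumptions, not the cited method's procedure.

```python
import numpy as np

def candidate_pseudo_labels(probs, intra_ratio=0.5, inter_quantile=0.8):
    """Build per-instance candidate label sets from confidence statistics.

    probs: (N, C) zero-shot or current-model class probabilities.
    A class enters instance i's candidate set if it is (a) close to that
    instance's own top score (intra-instance) and (b) high relative to the
    class's score distribution across all instances (inter-instance).
    """
    intra_ok = probs >= intra_ratio * probs.max(1, keepdims=True)
    inter_ok = probs >= np.quantile(probs, inter_quantile, axis=0)
    candidates = intra_ok & inter_ok
    # guarantee a non-empty candidate set by always keeping the top-1 class
    candidates[np.arange(len(probs)), probs.argmax(1)] = True
    return candidates
```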
For segmentation and detection, instance-aware strategies leveraging detection outputs deliver substantial gains in panoptic quality, Dice, and AJI metrics, closing the gap with fully-supervised upper bounds while requiring only sparse point annotations (Xiong et al., 18 Oct 2025). On semi-supervised instance segmentation, dual-threshold filtering, CLIP-driven category correction, and pixel-level uncertainty reweighting cumulatively yield large mAP improvements under severe annotation scarcity (Lin et al., 16 May 2025).
In incremental and memory-constrained settings, combining IPL with prototype-based memory selection and momentum-based pseudo-labeling reduces catastrophic forgetting and enhances new class generalization across sequential tasks (Wang et al., 23 Jan 2025).
7. Open Problems and Future Directions
While IPL selection has demonstrated clear utility, computational overhead and complexity may arise from instance- or candidate-wise optimization (e.g., QP solving, auxiliary multi-branch modeling, or Bayesian integration). Prospective advances include more efficient incremental update and approximation schemes, integration with model selection objectives, and scalable deployment for large-scale deep network architectures (Rodemann et al., 2023, Qiao et al., 28 Oct 2024).
Extensions to non-i.i.d. domains, such as sequence data, graphs, and temporally evolving distributions (e.g., temporal action localization (Zhou et al., 10 Jul 2024)), are ongoing. Dynamic weighting, adaptive stage selection, and model-agnostic aggregation rules are further areas for exploration (Bala et al., 6 Dec 2024, Li et al., 2023).
A plausible implication is that broadening IPL selection to leverage multiple data modalities and auxiliary information—such as language semantics, inter-instance geometry, or Bayesian priors—will continue to improve robustness, annotation efficiency, and domain adaptation in weakly supervised, noisy, or evolving tasks.
Overall, instance-aware pseudo-label selection constitutes a rigorously grounded, empirically validated, and practically impactful approach that is central to the next generation of semi-supervised and weakly supervised learning systems.