Curriculum Pseudo Labeling (CPL)
- Curriculum Pseudo Labeling (CPL) is a semi-supervised strategy that integrates pseudo-labels into training in order of increasing difficulty, effectively mitigating confirmation bias and class imbalance.
- It dynamically adjusts selection thresholds based on per-class, instance-level statistics, enabling tailored data integration across various modalities and learning tasks.
- CPL improves convergence and accuracy in semi-supervised learning, as demonstrated by significant error reductions and enhanced performance in image classification, domain adaptation, and other domains.
Curriculum Pseudo Labeling (CPL) is a semi-supervised and domain-adaptive learning strategy in which pseudo-labels assigned to unlabeled data are selectively incorporated into training according to a difficulty-ordered schedule—the curriculum. CPL methods modulate which pseudo-labels are accepted for training by dynamically adjusting selection thresholds or weights, based on the learning progress for individual classes, samples, or the model as a whole. This enables reliable and efficient use of unlabeled data, mitigates confirmation bias, and accelerates convergence, particularly in scenarios where labeled data are scarce or data distributions are imbalanced. Originally introduced in the context of image classification with FlexMatch, CPL techniques have since been extended to diverse domains including domain adaptation, regression, multi-label classification, sequence prediction, graph learning, and robotic control.
1. Foundational Motivation and Comparison to Fixed-Threshold Approaches
The standard pseudo-labeling paradigm, as exemplified by FixMatch, admits into training only those unlabeled samples whose predicted label confidence exceeds a fixed scalar threshold across all classes and throughout training. This strategy is fundamentally limited: (a) at early training stages, few samples exceed the high threshold, slowing learning; (b) classes with inherently higher difficulty or imbalances are systematically underrepresented among the pseudo-labeled pool, while "easy" classes dominate, reinforcing confirmation bias and class imbalance (Zhang et al., 2021, Kage et al., 2024).
CPL addresses these deficiencies by (i) defining per-class, instance-level, or adaptive thresholds that change over training, thereby enabling class-wise or even pixel-wise hard/easy progression; and (ii) establishing a schedule or rationale (the "curriculum") for growing the set of trusted pseudo-labels—starting with those that are easier or more reliable, then incorporating incrementally more difficult examples as the model matures (Zhang et al., 2021, Kim et al., 2023). This curriculum learning framework enables the model to maximize utilization of the unlabeled pool without sacrificing label quality.
2. Core Mathematical Formulations and Algorithms
CPL encompasses a family of formulations. The most canonical, as in FlexMatch, defines for each class $c$ a running statistic of the model's "learning effect" $\sigma_t(c)$ — the count of unlabeled samples currently pseudo-labeled as class $c$ with confidence exceeding the fixed threshold $\tau$ (Zhang et al., 2021). This is normalized to obtain $\beta_t(c) = \sigma_t(c) / \max_{c'} \sigma_t(c')$. A monotonic mapping $\mathcal{M}$ (e.g., the convex choice $\mathcal{M}(x) = x/(2-x)$) generates the class-wise dynamic threshold $\mathcal{T}_t(c) = \mathcal{M}(\beta_t(c)) \cdot \tau$. Unlabeled samples $u_n$ with predicted top class $c$ and confidence exceeding $\mathcal{T}_t(c)$ are selected for training.
Integrating CPL into an SSL framework follows a standardized pattern: the usual training loop is augmented with a per-class statistics update and a dynamic-threshold computation before each pseudo-label selection step.
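A FlexMatch-style selection step can be sketched as follows. This is a minimal illustration, not the reference implementation: the function name and the handling of the all-zero warm-up case are assumptions.

```python
import numpy as np

def cpl_select(probs, tau=0.95):
    """Hedged sketch of FlexMatch-style Curriculum Pseudo Labeling.

    probs: (N, C) array of model predictions on the unlabeled pool.
    Returns a boolean selection mask and the predicted classes.
    """
    conf = probs.max(axis=1)          # top-class confidence per sample
    pred = probs.argmax(axis=1)       # predicted class per sample
    n_classes = probs.shape[1]

    # Learning effect: per-class count of samples passing the fixed threshold
    sigma = np.array([np.sum((pred == c) & (conf > tau)) for c in range(n_classes)])

    # Normalize by the best-learned class (guard against an all-zero warm-up)
    beta = sigma / max(sigma.max(), 1)

    # Convex mapping M(x) = x / (2 - x) keeps hard-class thresholds low early on
    dyn_tau = (beta / (2.0 - beta)) * tau

    # Select samples whose confidence exceeds their class's dynamic threshold
    mask = conf > dyn_tau[pred]
    return mask, pred
```

Note how the poorly learned class (low $\sigma_t(c)$) receives a lower threshold, so its samples enter training earlier than under a fixed global cutoff.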
Alternative CPL designs for other modalities include percentile-based schedules (monotonically decreasing confidence cutoffs per round) (Cascante-Bonilla et al., 2020, Kim et al., 2023), density-based curricula for domain adaptation (Choi et al., 2019), SNR-based filtering for physiological regression (Wu et al., 6 Feb 2025), and temporally decaying weights for graph node classification (Zhang et al., 24 Apr 2025).
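The percentile-based alternative replaces threshold tuning with rank statistics: each self-training round lowers the confidence cutoff so that a growing fraction of the unlabeled pool is admitted. A minimal sketch, with the function name and round arithmetic as illustrative assumptions rather than the papers' exact recipe:

```python
import numpy as np

def percentile_cutoff(confidences, round_idx, total_rounds):
    """Admit the top fraction of unlabeled samples by confidence rank.

    Round 0 keeps the top (1/total_rounds) fraction of most confident
    samples; the final round admits the entire pool.
    """
    keep_frac = (round_idx + 1) / total_rounds
    cutoff = np.percentile(confidences, 100 * (1 - keep_frac))
    return confidences >= cutoff
```

Because selection depends only on ranks, this schedule sidesteps the sensitivity of absolute confidence thresholds to model calibration.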
A representative summary of key mathematical components is given in the following table:
| Variant | Curriculum Variable | Selection/Weighting Function | Update Mechanism |
|---|---|---|---|
| FlexMatch (Zhang et al., 2021) | Per-class "learning effect" $\sigma_t(c)$ | Dynamic threshold $\mathcal{T}_t(c) = \mathcal{M}(\beta_t(c)) \cdot \tau$ | Class counts, mapping $\mathcal{M}$ |
| Curriculum Labeling (Cascante-Bonilla et al., 2020) | Confidence percentile | Top percentile of confidences | Percentile lowered per round |
| Tabular CPL (Kim et al., 2023) | Confidence & density regularized | Density- and confidence-based score threshold | Density + confidence |
| ElimPCL (Cheng et al., 31 Mar 2025) | Trustworthy set size | Entropy/prototype-consistency filter | Epoch-wise expansion |
| Semi-rPPG (Wu et al., 6 Feb 2025) | Signal quality (SNR) ranking | Top-ranked unlabeled samples by SNR | Linear ramp of selected fraction |
| PTCL (Zhang et al., 24 Apr 2025) | Temporal distance from final label | Exponentially decaying weight | Step-wise update |
3. Adaptation to Diverse Data Modalities and Learning Problems
CPL was developed in the context of semi-supervised classification but its algorithmic pattern has been transferred to a wide spectrum of settings:
- Image Classification and SSL Benchmarks: The FlexMatch framework and its descendants boost both label efficiency and convergence speed on CIFAR-10, CIFAR-100, ImageNet-1K, STL-10, and SVHN, with up to 13.96% relative error reduction versus FixMatch and up to 5× faster convergence in low-label regimes (Zhang et al., 2021, Chen et al., 2023).
- Unsupervised and Source-Free Domain Adaptation: CPL methods rank or filter unlabeled target samples by density (Choi et al., 2019), prediction entropy/prototype consistency (Cheng et al., 31 Mar 2025), or other structure, to avoid noise accumulation and mitigate label drift across domains.
- Regression and Continuous Outputs: For tasks such as 3D rotation regression (pose), "hardness-aware" curricula use predictive entropy of the output distribution (e.g., matrix-Fisher on SO(3)) to schedule pseudo-label inclusion (Li et al., 23 Mar 2026).
- Multi-label and Multi-task Learning: In contextual and label-wise curricula, samples or label-entries are scheduled by intrinsic difficulty, such as label cardinality or confidence (Abdelfattah et al., 2022, Mekky et al., 12 Feb 2026).
- Graph and Temporal Learning: PTCL weighs pseudo-labels by temporal proximity to final "anchor" labels, using exponentially decaying weights along the temporal axis (Zhang et al., 24 Apr 2025).
- Reinforcement and Robotic Learning: Online grasp learning by CPL exploits per-pixel or per-action confidence schedules (e.g., success probability), ranging from global to spatially contextual (Le et al., 2024).
- Biomedical Applications: Semi-rPPG filters unlabeled facial video segments by heart rate signal quality, introducing higher-SNR pseudo-labels first in the curriculum (Wu et al., 6 Feb 2025).
- Tabular Learning: Regularization by feature density and the cluster assumption complements curriculum selection (Kim et al., 2023).
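As one concrete instance of these mechanisms, a PTCL-style temporal weight can be sketched as below. The exponential form and decay rate are illustrative assumptions, not the paper's exact parameterization:

```python
import math

def temporal_weight(t, t_anchor, lam=0.5):
    """Weight a pseudo-label by its temporal distance from the anchor label.

    Pseudo-labels at timesteps far from the final "anchor" label receive
    exponentially smaller weight; `lam` is an assumed decay hyperparameter.
    """
    return math.exp(-lam * abs(t_anchor - t))
```

The loss contribution of each timestep's pseudo-label is then scaled by this weight, so supervision concentrates near the trusted anchor and fades with temporal distance.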
4. Theoretical Insights, Convergence, and Empirical Results
CPL methods are grounded in curriculum learning theory, in particular the hypothesis that easy-to-hard progression yields efficient optimization landscapes and mitigates confirmation bias. Though most CPL works do not supply formal convergence proofs, theoretical analyses argue that CPL regularizes pseudo-label selection, yields a balanced empirical risk, and tightens upper bounds on domain adaptation error via progressive trust in pseudo-labels (Zhang et al., 2021).
Empirical gains are well-quantified:
- FlexMatch achieves relative error reductions of 13.96% and 18.96% over FixMatch on CIFAR-100 and STL-10 (4 labels/class), respectively (Zhang et al., 2021).
- Curriculum Labeling yields a 5.27% error on CIFAR-10 (4k labels) with WideResNet-28-2, rivaling UDA and FixMatch (Cascante-Bonilla et al., 2020).
- On Office-31 and Office-Home, CPL enhances adaptation accuracy by up to 2.4% (Zhang et al., 2021, Choi et al., 2019).
- SNR-based CPL in Semi-rPPG reduces RMSE by 21.4% relative to supervised-only training (Wu et al., 6 Feb 2025).

Recent works further confirm that CPL frameworks generalize robustly to long-tail, low-label, and domain-shifted settings, providing marked improvements in convergence time, absolute accuracy, and robustness to out-of-distribution data (Chen et al., 2023, Cheng et al., 31 Mar 2025).
5. Algorithmic and Practical Implementation Considerations
Most CPL instantiations introduce minimal architectural or computational overhead. Standard CPL requires maintaining per-class (or per-label, per-instance) counters or statistics, updating dynamic thresholds, and integrating a schedule (e.g., linear, convex, curriculum residual, or RL-discovered) (Zhang et al., 2021, Wang et al., 2022).
Best practices include:
- Normalizing per-class selection statistics to avoid bias;
- Using rank-based selection (e.g., by percentiles) to bypass sensitive threshold tuning (Cascante-Bonilla et al., 2020);
- Employing re-initialization or early stopping after pseudo-label integration to avoid confirmation bias and concept drift (Cascante-Bonilla et al., 2020, Kim et al., 2023);
- Combining with strong augmentations (e.g. MixUp, RandAugment, PoseMosaic) for improved regularization (Zhang et al., 2021, Li et al., 23 Mar 2026, Chen et al., 2023);
- Interpolating soft pseudo-labels and regularizing by density or entropy to further improve pseudo-label quality and downstream performance (Zhang et al., 2021, Kim et al., 2023);
- Adapting curriculum parameters (pace, shape, window size) to dataset scale and class imbalance.
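The first two practices can be combined in a class-balanced, rank-based selector. This is a hedged sketch with assumed names and an assumed per-class fraction, not a published recipe:

```python
import numpy as np

def balanced_topk(conf, pred, n_classes, frac=0.2):
    """Select the same top fraction of samples within each predicted class.

    Taking a per-class top-k (rather than one global cutoff) prevents easy
    classes from crowding out hard ones in the pseudo-labeled pool.
    """
    selected = np.zeros_like(conf, dtype=bool)
    for c in range(n_classes):
        idx = np.where(pred == c)[0]
        if idx.size == 0:
            continue  # no samples currently predicted as class c
        k = max(1, int(frac * idx.size))
        top = idx[np.argsort(conf[idx])[-k:]]   # k most confident in class c
        selected[top] = True
    return selected
```

Ramping `frac` upward across training rounds yields the easy-to-hard progression while keeping class proportions stable by construction.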
Limitations remain: class-imbalanced or open-set scenarios can distort statistics used for threshold schedules; advanced or meta-learned curricula may be needed for real-world non-uniform distributions; and brute-force threshold-tuning is discouraged in favor of curriculum-driven or percentile-based regularization (Zhang et al., 2021, Kage et al., 2024).
6. Variants, Advanced Strategies, and Open Questions
CPL variants have been developed along several axes:
- Selection metrics: Confidence, local density, entropy, SNR, temporal proximity, label cardinality, class overlap, contextual or spatial metrics.
- Weighting schemes: Binary masking, entropy weighting, exponentially decayed weights, soft label interpolation (Abdelfattah et al., 2022).
- Curriculum progression: Linear, convex, adaptive, RL-discovered, dynamic per-class, or temporal.
- Integration points: Global (SSL), per-pixel (robotic grasping), per-timestep (dynamical graphs), across modalities (audio, video, tabular).
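As an illustration of one weighting scheme from the list above, an entropy-based soft weight can be computed as follows. The normalization by $\log C$ is an assumed design choice:

```python
import numpy as np

def entropy_weight(probs, eps=1e-12):
    """Weight pseudo-labels by prediction certainty.

    Confident (low-entropy) predictions get weight near 1; a uniform
    (maximum-entropy) prediction gets weight 0.
    """
    p = np.clip(probs, eps, 1.0)
    ent = -(p * np.log(p)).sum(axis=-1)          # per-sample entropy
    return 1.0 - ent / np.log(probs.shape[-1])   # normalize to [0, 1]
```

Unlike binary masking, this keeps every unlabeled sample in the loss but down-weights uncertain pseudo-labels continuously.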
Active research topics include:
- Theoretical analysis of convergence and optimal curriculum pacing (Zhang et al., 2021, Kage et al., 2024);
- Performance-based or data-driven discovery of progression schedules (Wang et al., 2022);
- Robustness under severe class imbalance or highly non-stationary domains;
- Extension to regression, structured prediction, and other complex output spaces (Li et al., 23 Mar 2026, Wu et al., 6 Feb 2025);
- Meta-learning for schedule or selection metric optimization.
7. Notable Implementations and Representative Results (Selected Table)
| Domain | Algorithm (Citation) | Key CPL Mechanism | Relative Gain |
|---|---|---|---|
| Image SSL | FlexMatch (Zhang et al., 2021) | Per-class adaptive threshold | -14% error CIFAR-100 |
| Domain Adaptation | PCDA (Choi et al., 2019) | Density-based easy→hard curriculum | +2.7% acc. Office-Home |
| Multi-label | PLMCL (Abdelfattah et al., 2022) | Momentum, confidence-aware scheduler | +2–8% F1, various datasets |
| Regression (SO(3)) | HACMatch (Li et al., 23 Mar 2026) | Entropy-based, adaptive curriculum | -2° MeanMed PASCAL3D+ |
| Biomedical (rPPG) | Semi-rPPG (Wu et al., 6 Feb 2025) | SNR-based linear curriculum | -21.4% RMSE (VIPL-HR) |
| Graphs, dynamic node | PTCL (Zhang et al., 24 Apr 2025) | Temporal, exponentially decaying weights | +8–11% ACC/AUC |
| Tabular Data | CPL+R-CPL (Kim et al., 2023) | Percentile+likelihood regularization | +1–2% F1/acc; +29% low-data |
In summary, Curriculum Pseudo Labeling generalizes pseudo-labeling’s reliance on confidence-filtered selection by embedding the sample selection process in a curriculum learning paradigm. By doing so, CPL enables data- and class-adaptive leveraging of unlabeled data, mitigates detrimental error amplification, and drives improved generalization and convergence across domains (Zhang et al., 2021, Kage et al., 2024, Kim et al., 2023).
Referenced Works
- FlexMatch: "FlexMatch: Boosting Semi-Supervised Learning with Curriculum Pseudo Labeling" (Zhang et al., 2021)
- Review: "A Review of Pseudo-Labeling for Computer Vision" (Kage et al., 2024)
- Curriculum Labeling: "Curriculum Labeling: Revisiting Pseudo-Labeling for Semi-Supervised Learning" (Cascante-Bonilla et al., 2020)
- PCDA: "Pseudo-Labeling Curriculum for Unsupervised Domain Adaptation" (Choi et al., 2019)
- PLMCL: "PLMCL: Partial-Label Momentum Curriculum Learning for Multi-Label Image Classification" (Abdelfattah et al., 2022)
- HACMatch: "HACMatch Semi-Supervised Rotation Regression with Hardness-Aware Curriculum Pseudo Labeling" (Li et al., 23 Mar 2026)
- ElimPCL: "ElimPCL: Eliminating Noise Accumulation with Progressive Curriculum Labeling for Source-Free Domain Adaptation" (Cheng et al., 31 Mar 2025)
- Semi-rPPG: "Semi-rPPG: Semi-Supervised Remote Physiological Measurement with Curriculum Pseudo-Labeling" (Wu et al., 6 Feb 2025)
- PTCL: "PTCL: Pseudo-Label Temporal Curriculum Learning for Label-Limited Dynamic Graph" (Zhang et al., 24 Apr 2025)
- CPL for Tabular: "Revisiting Self-Training with Regularized Pseudo-Labeling for Tabular Data" (Kim et al., 2023)