CAMPUS: Competence-Aware Multi-Perspective Scheduling
- The paper introduces CAMPUS, an adaptive framework that unifies multiple difficulty metrics with dynamic competence estimation for improved convergence and task performance.
- It employs heterogeneous difficulty views—from lexical measures in LLMs to graph indices in GNNs—to tailor data exposure as the model’s abilities evolve.
- CAMPUS demonstrates significant empirical gains in accuracy, BLEU scores, and convergence speed across diverse domains including language, vision, and combinatorial scheduling.
Competence-Aware Multi-Perspective Curriculum Scheduling (CAMPUS) is a class of data-driven, adaptive scheduling frameworks that unify the notions of sample difficulty (“curriculum”) and learner competence to accelerate and optimize training in machine learning and optimization problems. Unlike conventional monotonic curricula or static difficulty orderings, CAMPUS incorporates multiple, heterogeneous views of difficulty and matches data presentation to evolving model abilities, typically resulting in improved convergence, stability, and task performance across diverse domains.
1. Foundational Principles and Definitions
CAMPUS generalizes classical curriculum learning by introducing two key innovations: (i) the simultaneous use of multiple, often orthogonal, difficulty metrics (“multi-perspective curricula”); (ii) explicit, model-driven estimation of competence to dynamically schedule which samples or curricula are presented at each training step. The framework has been independently proposed and validated in areas including deep neural network instruction tuning, graph neural networks, multilingual translation, visual concept acquisition, dialogue modeling, and combinatorial scheduling (Li et al., 17 Sep 2025, Vakil et al., 2023, Zhang et al., 2021, Li et al., 2020, Cai et al., 2020, Christou et al., 2022).
Let $\mathcal{D}$ denote a dataset of size $N$, and $\{d_1, \dots, d_K\}$ a collection of $K$ distinct scalar-valued difficulty metrics or functions, each mapping a data point (and possibly model parameters) to $\mathbb{R}$ or $[0,1]$. For each metric $d_k$, the dataset is sorted (ascending or descending, depending on the learning objective) into $\mathcal{D}_k$. At each training iteration $t$, the model's current competence—either globally or per-task, per-view, or per-skill—is estimated according to explicit schedules, empirical loss, or Bayesian inference. The data scheduler then selects, from potentially all $K$ curricula, a subset or batch aligned with the current competence and learning goals.
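For concreteness, a minimal sketch of this construction, assuming toy data and illustrative names (`build_curricula` and the two views are not from the cited papers):

```python
import numpy as np

def build_curricula(dataset, difficulty_fns):
    """Sort the dataset once per difficulty view, yielding K independent
    sub-curricula (one ordered index list per metric)."""
    curricula = {}
    for name, fn in difficulty_fns.items():
        scores = np.array([fn(x) for x in dataset])
        curricula[name] = np.argsort(scores)  # ascending: easy -> hard
    return curricula

# Toy example: two (roughly orthogonal) views over short text samples.
dataset = ["a b", "a b c d e", "a a a a", "x y z"]
views = {
    "length": lambda s: len(s.split()),  # longer = harder
    "diversity": lambda s: len(set(s.split())) / len(s.split()),  # more varied = harder
}
print(build_curricula(dataset, views))  # one easy-to-hard ordering per view
```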
2. Multi-Perspective Difficulty Metrics
CAMPUS frameworks extract and exploit a heterogeneous suite of difficulty views, each capturing fundamentally different aspects of sample complexity relevant to the task. Typical instantiations include:
- Natural Language Instruction Tuning: token length, MTLD (lexical diversity), batch-level cross-entropy, learned competence-aware scores (Li et al., 17 Sep 2025).
- Graph Neural Network Training: 26 graph-theoretic indices, e.g., degree, average neighbor degree, clustering coefficients, subgraph density, closeness/eigenvector/Katz centrality, connectivity, minimum matching (Vakil et al., 2023).
- Dialogue Generation Complexity: specificity (mean normalized IDF; sketched in code after this list), repetitiveness, query-response relatedness (cosine similarity), continuity (semantic similarity to next human utterance), model confidence (negative cross-entropy) (Cai et al., 2020).
- Multilingual Machine Translation: language similarity, subword overlap, label-smoothed cross-entropy (Zhang et al., 2021).
- Visual Concept Learning: question-concept composition, concept confusion, question-level conjectured difficulty inferred via Item Response Theory (Li et al., 2020).
- Course Program Scheduling: course difficulty indices, pre/co-requisite structure, academic group membership, expected grade prediction (Christou et al., 2022).
These views are not combined into a single scalar; instead, each view defines an independent, ordered data stream (“sub-curriculum”).
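As a concrete instance of one view from the list above, the sketch below computes dialogue specificity as mean normalized IDF, under simplified assumptions (a toy corpus; unseen tokens default to a document frequency of 1, i.e., maximal IDF):

```python
import math
from collections import Counter

def specificity(response, doc_freq, n_docs):
    """Mean normalized IDF of the response tokens: higher = more specific."""
    idfs = [math.log(n_docs / doc_freq.get(tok, 1)) for tok in response.split()]
    max_idf = math.log(n_docs)  # a token appearing in exactly one document
    return sum(i / max_idf for i in idfs) / len(idfs)

corpus = ["i like tea", "i like coffee", "quantum chromodynamics is hard"]
doc_freq = Counter(tok for doc in corpus for tok in set(doc.split()))
print(specificity("i like tea", doc_freq, len(corpus)))              # generic -> lower
print(specificity("quantum chromodynamics", doc_freq, len(corpus)))  # specific -> 1.0
```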
3. Competence Estimation and Curriculum Matching
A central feature of CAMPUS is its dynamic matching of model exposure to current ability. Competence metrics may be:
- Global competence schedules: Power-law or exponential schedules parameterizing the fraction of data exposed at iteration $t$, e.g., the root schedule
  $$c(t) = \min\!\left(1,\ \left(t\,\frac{1 - c_0^{\,p}}{T} + c_0^{\,p}\right)^{1/p}\right),$$
  with initial competence $c_0$, duration $T$, and exponent $p$, as in (Vakil et al., 2023, Li et al., 17 Sep 2025, Cai et al., 2020); see the code sketch at the end of this section.
- Per-task/language/skill proficiency: Cross-entropy-to-likelihood conversion for task $k$,
  $$c_k = \exp\!\left(\ell_k^{*} - \ell_k\right) \in (0, 1],$$
  where $\ell_k$ is the current loss and $\ell_k^{*}$ the converged bitext loss (Zhang et al., 2021).
- Bayesian/IRT-based learners: Posterior means estimated via mIRT (multi-dimensional Item Response Theory), modeling both concept competence ($\theta$) and concept difficulty ($b$) (Li et al., 2020).
- Learned surrogates: Auxiliary neural networks trained to classify or regress sample “easiness”/“hardness” according to fine-tuning progression and adversarial discrimination (Li et al., 17 Sep 2025).
Competence feeds directly into the selection mechanism, controlling either the scope of exposed data within each view, the progression among views, or the dynamic weighting of sample selection.
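The sketch below illustrates three of these estimators under stated assumptions: the root schedule above, the loss-to-likelihood conversion as just described, and a one-dimensional item response curve standing in for the multi-dimensional model of (Li et al., 2020); all function names are illustrative:

```python
import math

def competence_schedule(t, T, c0=0.1, p=2):
    """Fraction of each sorted view exposed at step t (root schedule)."""
    return min(1.0, (t * (1 - c0 ** p) / T + c0 ** p) ** (1 / p))

def task_competence(current_loss, converged_loss):
    """Cross-entropy-to-likelihood conversion: exp(l* - l), in (0, 1]."""
    return math.exp(converged_loss - current_loss)

def irt_prob(theta, b):
    """1-D item response curve: P(correct | competence theta, difficulty b)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

print(competence_schedule(t=0, T=1000))     # 0.1  -> initial competence c0
print(competence_schedule(t=1000, T=1000))  # 1.0  -> full data exposure
print(task_competence(2.0, 1.5))            # ~0.61 -> still short of convergence
print(irt_prob(theta=1.0, b=0.0))           # ~0.73 -> concept largely mastered
```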
4. Scheduling Algorithms and Training Workflows
Scheduling algorithm structure in CAMPUS follows a modularized pattern:
- Initialization:
- Compute all difficulty metrics for the dataset; produce sorted lists (one per metric).
- Initialize all model and scoring network parameters.
- Iterative Scheduling (at each training step or epoch):
- For each view $k$, determine the fraction $c_k(t)$ of $\mathcal{D}_k$ to expose, typically as a monotonically increasing function of $t$.
- Partition the exposed portion of $\mathcal{D}_k$ into consecutive sub-curricula $S_{k,1}, S_{k,2}, \ldots$.
- Selection:
- Estimate the current model's competence from per-view PPL (perplexity), average loss, or explicit IRT parameters.
- From the sub-curricula $\{S_{k,j}\}$, select the view and sub-curriculum whose batch best matches the model's current competence (e.g., via minimal PPL, maximal reward, or active learning criteria).
- Train on the selected batch; update models and (optionally) the difficulty ordering for competence-aware metrics.
- Advance the corresponding schedule and update the exposure fraction $c_k(t)$ accordingly.
- Resampling and Reweighting:
- For language/task curriculum learning: oversample tasks with lower current competence or adjust mixture weights dynamically (e.g., sampling probability $p_k \propto 1 - c_k$) (Zhang et al., 2021); see the sketch after this list.
- For multi-view learning: allow “soft” view selection by assigning weights or restrict to hard, single-view selection (Vakil et al., 2023).
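A minimal sketch of the competence-driven reweighting; the proportionality $p_k \propto 1 - c_k$ is an illustrative choice consistent with "oversample tasks with lower current competence", not necessarily the exact rule of (Zhang et al., 2021):

```python
import numpy as np

def mixture_weights(competences, floor=1e-3):
    """Sampling mass per task, decreasing in current competence c_k."""
    w = np.maximum(1.0 - np.asarray(competences, dtype=float), floor)
    return w / w.sum()

print(mixture_weights([0.9, 0.5, 0.2]))  # lowest-competence task sampled most
```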
Illustrative pseudocode and mathematical kernels for these schedules are detailed in (Li et al., 17 Sep 2025, Vakil et al., 2023, Zhang et al., 2021, Cai et al., 2020).
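Complementing those references, below is a hedged, end-to-end toy version of the loop: the scalar "model", `loss_on`, and `train_step` are placeholders, and view selection uses the minimal-loss matching criterion named in the Selection step:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "model": a scalar ability; a sample's loss is the amount by which its
# difficulty exceeds the ability. Purely illustrative, not any paper's code.
ability = 0.0

def loss_on(difficulty, idx):
    """Mean hinge-style loss of the current model on batch idx."""
    return float(np.mean(np.maximum(difficulty[idx] - ability, 0.0)))

def train_step(difficulty, idx, lr=0.05):
    """Crude 'learning': ability rises in proportion to the batch loss."""
    global ability
    ability += lr * loss_on(difficulty, idx)

# Two difficulty views over the same 100 samples, each sorted ascending
# into its own sub-curriculum (Sections 1-2).
views = {"view_a": rng.random(100), "view_b": rng.random(100)}
curricula = {name: np.argsort(d) for name, d in views.items()}

T, steps, batch, c0, p = 200, 200, 8, 0.1, 2
for t in range(steps):
    # (i) exposure fraction from the root competence schedule (Section 3)
    c = min(1.0, (t * (1 - c0**p) / T + c0**p) ** (1 / p))
    # (ii) probe each view's exposed (easy) prefix; keep the best-matched batch
    best_name, best_idx, best_loss = None, None, float("inf")
    for name, order in curricula.items():
        exposed = order[: max(batch, int(c * len(order)))]
        cand = rng.choice(exposed, size=batch, replace=False)
        l = loss_on(views[name], cand)
        if l < best_loss:
            best_name, best_idx, best_loss = name, cand, l
    # (iii) train on the selected batch
    train_step(views[best_name], best_idx)

print(f"final ability: {ability:.2f}")  # has grown as the curriculum opened up
```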
5. Representative Domains and Empirical Results
CAMPUS has been prototyped and evaluated in several domains, each exploiting domain-specific difficulty measures and competence models:
- LLM Instruction Tuning: CAMPUS yields superior overall accuracy in arithmetic, code, and dialogue domains on LLaMA-7B and 13B, outperforming baselines such as random shuffle, Tree-Instruct, and Conifer by ~7% average across GSM8K, HumanEval, and MT-Bench (Li et al., 17 Sep 2025).
- Graph Neural Network Training: Incorporating multi-view graph complexity and competence schedules produces F1 or accuracy improvements of 3–7 percentage points over competing methods in bioinformatics link prediction (PGR, GDPR) and citation node classification (Cora, OGBN-Arxiv) (Vakil et al., 2023).
- Multilingual Machine Translation: Dynamic scheduling via self-evaluated and HRL-evaluated (high-resource language) competence leads to 1–2 BLEU increases and closes the high-/low-resource gap without sacrificing overall model quality (Zhang et al., 2021).
- Dialogue Generation: Adaptive multi-curricula scheduling consistently boosts all 13 tested metrics, with increases of 40–100% in BLEU and “Distinct” as well as superior human evaluation rates (Cai et al., 2020).
- Visual Reasoning and VQA: Variational mIRT-guided curricula reduce question sampling by ∼60%, triple convergence speed, and reach >99% single-concept accuracy on CLEVR (Li et al., 2020).
- Academic Course Scheduling: Mixed-integer linear programming (MILP) based CAMPUS achieves optimal schedules in under 10 seconds, automating what formerly required hours of manual advisor planning and substantially improving course sequencing under objectives such as GPA maximization, graduation speed, and difficulty balancing (Christou et al., 2022).
| Domain | Difficulty metrics | Competence estimation | SOTA improvement |
|---|---|---|---|
| LLM finetuning | Length, MTLD, loss, R(x) | PPL, learned scorer | +7% accuracy avg. |
| GNNs | 26 graph indices | Coverage, batch loss | +3–7pp F1/Acc |
| MultiMT | Resource overlap, loss | Loss-to-likelihood, HRL-agg. | +1–2 BLEU |
| Dialogue modeling | SPEC, REPT, QR, CONT, CONF | RL reward, loss, steps | 40–100% on BLEU, Distinct |
| Visual QA | mIRT, question diff. | Bayesian mIRT θ, b | ×3 convergence |
| Course planning | Prereqs, term, diff., GPA | ML GPA predictor | Automated, optimal in <10s |
6. Implementation Considerations and Scalability
CAMPUS frameworks are modular and compatible with most neural and combinatorial optimization pipelines. Notable details include:
- Scalability: Complexity is governed by the need to (i) precompute and sort difficulty views ($O(K N \log N)$), (ii) dynamically evaluate model loss or PPL on candidate sub-batches per iteration, and (iii) optionally re-sort competence-aware lists every fixed number of steps (Li et al., 17 Sep 2025, Vakil et al., 2023).
- Hyperparameters: Competence schedule parameters ($c_0$, $T$, $p$), batch sizes, sorting frequency, and policy learning rates are the primary tunables across domains.
- Integration: CAMPUS schedules subsume and interoperate with filtering-based selection schemes (e.g., instance-based filtering, DEITA, IFD) (Li et al., 17 Sep 2025).
- Overhead: Overhead for view selection and resampling is negligible relative to training, provided the number of views $K$ is moderate. For GNNs, training with CAMPUS achieves a 1.5–2× speedup over naive curricula owing to restricted batch exposure in early phases (Vakil et al., 2023).
7. Theoretical and Practical Impact
CAMPUS provides a principled approach for adaptive curriculum learning that avoids both the rigidity of static difficulty orderings and the inefficiency of “one-size-fits-all” data presentation. By aligning model training with its evolving “zone of proximal development,” CAMPUS frameworks yield marked efficiency gains, higher asymptotic task performance, and empirically robust generalization across language, vision, optimization, and graph domains (Li et al., 17 Sep 2025, Vakil et al., 2023, Cai et al., 2020, Zhang et al., 2021, Christou et al., 2022). A plausible implication is the eventual unification of data-centric and model-centric approaches to sample scheduling in large-scale machine learning, with CAMPUS-style adaptive multi-view frameworks serving as a blueprint for future curriculum design.