PRBoost: Probing & Prompt-Based Boosting

Updated 9 December 2025
  • PRBoost is a dual-method approach encompassing both a probing-based variable selection algorithm and a prompt-based rule discovery system for weakly-supervised learning.
  • The probing-based method uses shadow variables to detect noise, efficiently halting boosting iterations while achieving competitive true positive and false discovery rates.
  • The prompt-based method leverages masked language models and human-vetted rules to iteratively improve NLP task performance with measurable gains in accuracy and F1 scores.

PRBoost refers to two distinct methodologies found in the machine learning literature: (1) a probing-based variable selection approach for model-based boosting in statistical modeling, and (2) a prompt-based rule discovery and boosting system for interactive weakly-supervised learning in natural language processing. Both share the PRBoost designation but apply fundamentally different mechanisms to address variable selection and weak label quality, respectively. Both systems are covered here, as defined in Thomas et al. (2017) and Zhang et al. (2022).

1. Probing-Based PRBoost for Sparse Variable Selection

PRBoost, as introduced by Thomas et al. (2017), is a single-pass, probing-based algorithm for variable selection within component-wise model-based boosting. The key innovation lies in augmenting the feature set with so-called shadow variables—random permutations of each original variable—which serve as proxies representing noise features. During the sequential model-fitting process, selection of a shadow variable signals that subsequent variables are indistinguishable from noise, and the procedure terminates, returning the originally selected variables as the informative set (Thomas et al., 2017).
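The shadow-augmentation step can be sketched in NumPy (the function name is illustrative; the reference implementation is the R package mboost, not this code):

```python
import numpy as np

def add_shadow_variables(X, rng=None):
    """Augment X with one independently permuted (shadow) copy of each column.

    Columns 0..p-1 are the original variables; columns p..2p-1 are their
    shadows, each a random permutation of the corresponding original column,
    so each shadow preserves a variable's marginal distribution but carries
    no association with the response.
    """
    rng = np.random.default_rng(rng)
    shadows = np.column_stack([rng.permutation(col) for col in X.T])
    return np.hstack([X, shadows])

X = np.arange(12).reshape(4, 3)
X_aug = add_shadow_variables(X, rng=0)
# X_aug has shape (4, 6): 3 original columns followed by 3 shadow columns
```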

2. Formal Algorithmic Description (Probing PRBoost)

The augmented variable matrix $\bar{X}$ is constructed by concatenating the original $X$ with permuted (shadow) versions. The boosting process proceeds as follows:

  • Initialization:
    • $\hat{f}^{[0]}(x) = \arg\min_c \sum_{i=1}^n \rho(y^{(i)}, c)$
    • $m \gets 0$
  • Iteration:
    • Compute negative gradients: $u^{(i)} = -\frac{\partial}{\partial f} \rho(y^{(i)}, f)\big|_{f = \hat{f}^{[m-1]}(x^{(i)})}$
    • For all $j = 1, \ldots, 2p$, fit base learner $h_j$ to the gradients and compute its residual error.
    • Identify $j^* = \arg\min_j \sum_{i=1}^n [u^{(i)} - \hat{h}_j^{[m]}(x_j^{(i)})]^2$.
    • If $x_{j^*}$ is a shadow variable, terminate. Otherwise, update
      $\hat{f}^{[m]}(x) = \hat{f}^{[m-1]}(x) + \nu \cdot \hat{h}_{j^*}^{[m]}(x_{j^*})$.
  • Output: Return all original (non-shadow) variables selected before stopping.

No hyperparameters other than the learning rate $\nu$ are required. No cross-validation or bootstrap resampling is used to choose the stopping point.
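A minimal NumPy sketch of this loop, using simple univariate least-squares base learners and the squared-error loss (both assumed here for illustration; mboost supports general losses and learners):

```python
import numpy as np

def prboost_select(X, y, nu=0.1, max_iter=1000, rng=0):
    """Probing-based component-wise L2 boosting (illustrative sketch).

    Base learners are h_j(x_j) = b_j * x_j fitted by least squares; for the
    L2 loss the negative gradient is simply the current residual.  Boosting
    stops as soon as a shadow (permuted) column gives the best gradient fit.
    """
    rng = np.random.default_rng(rng)
    n, p = X.shape
    shadows = np.column_stack([rng.permutation(c) for c in X.T])
    X_aug = np.hstack([X, shadows])          # columns p..2p-1 are shadows
    f = np.full(n, y.mean())                 # offset minimizes L2 loss
    selected = set()
    for _ in range(max_iter):
        u = y - f                            # negative gradient for L2 loss
        denom = (X_aug ** 2).sum(axis=0)
        coefs = X_aug.T @ u / np.where(denom == 0, 1.0, denom)
        rss = ((u[:, None] - X_aug * coefs) ** 2).sum(axis=0)
        j = int(np.argmin(rss))
        if j >= p:                           # a shadow won: stop here
            break
        selected.add(j)
        f = f + nu * coefs[j] * X_aug[:, j]
    return sorted(selected)
```

On sparse synthetic data (for example, y depending only on the first two of ten columns), the returned index set should recover the informative variables before a shadow is selected.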

3. Empirical Properties and Comparative Evaluation

Across 12 high-dimensional scenarios ($p \in \{100, 500, 1000\}$, $n \in \{100, 500\}$), PRBoost was benchmarked against cross-validation (CV)–tuned boosting and stability selection (SS), measuring true positive rate (TPR) and false discovery rate (FDR). PRBoost achieved a median TPR of 0.60 and a median FDR of 0.20 in high-dimensional settings, a trade-off intermediate between CV and SS:

Method           median TPR   median FDR
CV (25-boot)     0.75         0.50
SS (PFER = 2.5)  0.40         0.10
PRBoost          0.60         0.20

Computation time per replicate was substantially lower for PRBoost (0.8 s) than for CV (12 s) or SS (60 s). PRBoost delivered a substantially improved FDR over CV and selection stability similar to SS (Thomas et al., 2017).

4. Implementation and Practical Notes (Probing PRBoost)

PRBoost is implemented in the R package mboost via the argument probe=TRUE. The complexity per boosting iteration is $O(p)$. The default learning rate $\nu = 0.1$ is generally sufficient, and the only randomness arises from shadow variable generation. No parameter tuning for stopping, error thresholds, or regularization is necessary. In practice, the method has shown effectiveness in gene expression settings (e.g., selecting 10 out of 4088 genes in under 1 s for riboflavin production estimation), with strong sparsity and selection stability (Thomas et al., 2017).

5. Prompt-Based PRBoost for Interactive WSL

The prompt-based PRBoost system, as defined by Zhang et al. (2022), is an interactive weakly-supervised learning (WSL) framework for NLP tasks. It combines boosting-style instance weight updating with prompt-based rule induction from pre-trained masked LMs. The workflow iteratively (a) identifies highest-error instances under the current WSL model, (b) generates prompt-based rule candidates using LMs, (c) has human annotators select among these candidates, (d) matches validated rules to new unlabeled instances for weak labeling, and (e) retrains or fine-tunes the WSL model, incorporating self-training for uncovered instances (Zhang et al., 2022).
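The round structure (a)–(e) can be sketched as a toy loop. Every helper passed in here (predict, propose, vet) is a hypothetical stand-in for the components described above, not the authors' implementation:

```python
from dataclasses import dataclass

@dataclass
class Rule:
    pattern: str   # surface pattern proposed by the masked LM
    label: str     # class assigned when the pattern matches

def one_round(predict, examples, weights, propose, vet, unlabeled, budget):
    """One illustrative PRBoost round over toy string data."""
    # (a) highest-weight misclassified instances under the current model
    errs = sorted(((w, x) for (x, y), w in zip(examples, weights)
                   if predict(x) != y), reverse=True)
    hard = [x for _, x in errs[:budget]]
    # (b) + (c) the LM proposes candidate rules; a human keeps the good ones
    accepted = vet(propose(hard))[:budget]
    # (d) weakly label unlabeled instances matched by an accepted rule
    weak = [(x, r.label) for x in unlabeled
            for r in accepted if r.pattern in x]
    # (e) model retraining / self-training would consume `weak` here
    return accepted, weak
```

With an always-positive stub model and a single proposed rule, the loop surfaces the misclassified example and propagates the vetted rule's label to matching unlabeled text.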

6. Algorithmic Structure and Mathematical Formulation (Prompt-Based PRBoost)

  • Initialization: Prepare sets $\mathcal{D}_u$ (unlabeled), $\mathcal{D}_l'$ (weakly labeled via initial rules), and $\mathcal{D}_l$ (small clean set); initialize the rule set $\mathcal{R}$; set sample weights $w_i^{(1)} = 1/|\mathcal{D}_l|$.
  • Boosting Step: At round $t$, compute the ensemble error

    $\text{err}_{t-1} = \frac{\sum_{i \in \mathcal{D}_l} w_i^{(t)} \cdot \mathbb{I}[y_i \neq f_{t-1}(x_i)]}{\sum_i w_i^{(t)}}$

    and update instance weights:

    $w_i^{(t+1)} = w_i^{(t)} \cdot \exp\left[\alpha_{t-1} \cdot \mathbb{I}[y_i \neq f_{t-1}(x_i)]\right]$

    where $\alpha_{t-1} = \log\left(\frac{1 - \text{err}_{t-1}}{\text{err}_{t-1}}\right) + \log(K-1)$ and $K$ is the number of classes.

  • Rule Discovery: Prompt top-weighted hard examples through a template $\tau(\cdot)$ and masked LM $\mathcal{M}$, creating candidate rules from the top-$k$ LM predictions.
  • Human Annotation: Human experts vet up to $B$ candidates per round, producing the accepted set $\mathcal{R}_t^+$.
  • Rule Matching and Weak Label Creation: For an accepted rule $r_j$, assign a label to instances $x$ using a similarity score $s_j = \beta s_j^a + (1-\beta) s_j^b$ (combining embedding similarity and vocabulary overlap), with thresholding.
  • Model Training and Self-Training: Expand the weak labeled set, retrain a weak WSL model minimizing cross-entropy, and self-train on unmatched data via label sharpening and KL loss.
  • Ensembling: The final prediction is the unweighted average of the round models: $f(x) = \frac{1}{T} \sum_{t=1}^T m_t(x)$.
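The weight-update arithmetic above can be checked with a minimal numeric sketch (a SAMME-style multiclass update, which the $\alpha$ formula with its $\log(K-1)$ term corresponds to):

```python
import math

def boost_weight_update(weights, mistakes, K):
    """One round of the multiclass boosting weight update described above.

    weights  : current instance weights w_i^(t)
    mistakes : booleans, True where y_i != f_{t-1}(x_i)
    K        : number of classes
    Returns the updated weights plus the round's error and alpha.
    """
    err = sum(w for w, m in zip(weights, mistakes) if m) / sum(weights)
    alpha = math.log((1 - err) / err) + math.log(K - 1)
    new_weights = [w * math.exp(alpha) if m else w
                   for w, m in zip(weights, mistakes)]
    return new_weights, err, alpha
```

For instance, with four equally weighted instances, one mistake, and $K = 4$: err = 0.25, alpha = log(3) + log(3) = log(9), so the misclassified instance's weight is multiplied by 9 while the others are untouched.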

7. Experimental Evaluation and Ablation (Prompt-Based PRBoost)

Evaluation covered four NLP tasks: TACRED, DBPedia, ChemProt, and AG News. Experiments used 10 rounds, the 10 hardest examples per round, and a human annotation budget of 100 rules per round. PRBoost outperformed state-of-the-art WSL baselines (e.g., Snorkel, LOTClass, COSINE), with relative improvements of up to +8.4% on TACRED (F1), +7.2% on DBPedia, and +7.3% on ChemProt. After fine-tuning, PRBoost maintained a +2.4% average gain. On three datasets, it narrowed the gap to fully supervised performance from roughly 18% to within 5–7%. Annotation was efficient (about 3 s per rule), and ablations demonstrated the critical contributions of interactive rule composition and self-training (Zhang et al., 2022).

Dataset    Relative Gain over WSL Baselines   Metric   Assessment
TACRED     +8.4%                              F1       Substantial
DBPedia    +7.2%                              Acc.     Substantial
ChemProt   +7.3%                              Acc.     Substantial
AG News    +2.4%                              Acc.     Modest

8. Limitations and Prospects for Extension

Probing-based PRBoost depends only on the randomness of shadow variable generation and does not require further hyperparameter tuning or resampling. It may be less directly aligned with predictive accuracy compared to methods optimizing that metric, but achieves strong sparsity and stability.

Prompt-based PRBoost relies on human judgment to validate rules, requiring the design of task-specific prompt templates. Its effectiveness depends on the ability of LMs to yield informative rule candidates and on efficient human vetting. Prospective improvements include automated rule-quality prediction, prompt template optimization, domain-specific annotation strategies, and extension beyond classification to structured output tasks (Zhang et al., 2022).

References

  • "Probing for sparse and fast variable selection with model-based boosting" (Thomas et al., 2017)
  • "PRBoost: Prompt-Based Rule Discovery and Boosting for Interactive Weakly-Supervised Learning" (Zhang et al., 2022)
