SPLBoost: Robust Self-Paced Boosting
- SPLBoost is a robust boosting algorithm integrating self-paced learning into AdaBoost to automatically down-weight or discard noisy and outlying samples.
- The method alternates between closed-form latent weight updates and weak learner optimization using various SP-regularizers, ensuring effective sample selection.
- Empirical evaluations on synthetic and UCI datasets demonstrate that SPLBoost achieves lower test-error rates and enhanced robustness under noisy conditions.
SPLBoost is a robust boosting algorithm designed by integrating the self-paced learning (SPL) paradigm into the AdaBoost framework. The principal innovation lies in introducing a latent sample-weighting mechanism governed by a self-paced regularizer, which adaptively emphasizes easy samples and down-weights or discards potential outliers within each boosting round. This yields a saturated loss function, rendering SPLBoost highly insensitive to noise and extreme outlier contamination, and it is implemented via minimal modifications to standard boosting routines (Wang et al., 2017).
1. Formal Objective and SPL Regularization Schemes
Given binary-labeled data $\{(x_i, y_i)\}_{i=1}^n$ with $y_i \in \{-1, +1\}$, and a current strong classifier $F(x)$, SPLBoost iteratively seeks to add a new weak learner $f(x)$ with coefficient $\alpha$, while jointly optimizing latent sample weights $\mathbf{v} = (v_1, \dots, v_n)$ via a self-paced regularizer. The per-iteration objective is formulated as:

$$\min_{\alpha,\, f,\, \mathbf{v}} \; \sum_{i=1}^{n} v_i \exp\!\big(-y_i\,[F(x_i) + \alpha f(x_i)]\big) \;+\; \sum_{i=1}^{n} g(v_i; \lambda)$$

Here, $v_i \in [0, 1]$ encodes the participation level of sample $i$ in the current round, and $g(v; \lambda)$ is the SP-regularizer (also referred to as the "age" regularizer) parametrized by $\lambda$. Three prevalent regularization choices are provided, all yielding closed-form solutions $v^*(\ell; \lambda)$ in terms of each sample's loss $\ell_i = \exp(-y_i F(x_i))$:
| Regularizer Type | Closed-form weight $v^*(\ell; \lambda)$ | SP-regularizer $g(v; \lambda)$ |
|---|---|---|
| Hard weighting | $1$ if $\ell < \lambda$; $0$ otherwise | $-\lambda v$ |
| Linear soft weighting | $\max(1 - \ell/\lambda,\, 0)$ | $\lambda\left(\tfrac{1}{2} v^2 - v\right)$ |
| Polynomial soft ($t > 1$) | $(1 - \ell/\lambda)^{1/(t-1)}$ if $\ell < \lambda$; $0$ otherwise | $\lambda\left(\tfrac{1}{t} v^t - v\right)$ |
By integrating $g(v; \lambda)$ with the exponential loss, SPLBoost automates sample selection, systematically suppressing the influence of large-loss (likely noisy or outlying) samples.
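The closed-form weight maps above translate directly into a few lines of code. The following is a minimal illustrative sketch (not the authors' implementation) of $v^*(\ell; \lambda)$ for the three regularizers:

```python
import numpy as np

def v_hard(loss, lam):
    """Hard weighting: keep a sample fully (v = 1) iff its loss is below lambda."""
    return (loss < lam).astype(float)

def v_linear(loss, lam):
    """Linear soft weighting: weight decays linearly from 1 to 0 as the loss approaches lambda."""
    return np.clip(1.0 - loss / lam, 0.0, 1.0)

def v_polynomial(loss, lam, t=3):
    """Polynomial soft weighting (t > 1): v = (1 - loss/lambda)^(1/(t-1)) below lambda, else 0."""
    return np.clip(1.0 - loss / lam, 0.0, 1.0) ** (1.0 / (t - 1))

# Example: losses straddling the threshold lambda = 1.0
losses = np.array([0.1, 0.5, 0.9, 1.5, 5.0])
print(v_hard(losses, 1.0))        # [1. 1. 1. 0. 0.]
print(v_linear(losses, 1.0))      # [0.9 0.5 0.1 0.  0. ]
print(v_polynomial(losses, 1.0))  # approx. [0.95 0.71 0.32 0.   0.  ]
```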
2. Alternating Optimization Procedure
SPLBoost employs a block-coordinate descent within each boosting round, alternating between two update phases:
a) Majorization (Update $\mathbf{v}$):
With $(\alpha, f)$ fixed, update each $v_i$ as:

$$v_i^* = \arg\min_{v \in [0,1]} \; v\,\ell_i + g(v; \lambda), \qquad \ell_i = \exp\!\big(-y_i F(x_i)\big)$$

The solutions $v_i^*$ depend directly on the choice of $g$ (see the table above).
b) Minimization (Update $\alpha$, $f$):
With $\mathbf{v}$ fixed, minimize:

$$\min_{\alpha,\, f} \; \sum_{i=1}^{n} v_i \exp\!\big(-y_i\,[F(x_i) + \alpha f(x_i)]\big)$$

Analogous to AdaBoost, $f$ is fit by minimizing the weighted squared-error proxy (equivalent, for $\{-1,+1\}$-valued weak learners, to the weighted misclassification error) with effective weights $u_i = v_i w_i$, where $w_i \propto \exp(-y_i F(x_i))$ are the usual AdaBoost weights. The optimal coefficient is:

$$\alpha = \frac{1}{2}\ln\frac{1 - \mathrm{err}}{\mathrm{err}}, \qquad \mathrm{err} = \frac{\sum_{i:\, y_i \neq f(x_i)} u_i}{\sum_{i} u_i}$$

Sample weights are then updated for the next round:

$$w_i \leftarrow w_i \exp\!\big(-\alpha\, y_i f(x_i)\big)$$
Empirical observations suggest that often only one inner alternation suffices per round for effective optimization.
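For completeness, the coefficient formula is the standard AdaBoost calculation applied to the $u_i$-weighted exponential loss (a short derivation restated here, not quoted from the paper):

$$\sum_i u_i\, e^{-\alpha y_i f(x_i)} \;=\; e^{-\alpha}\!\!\sum_{i:\, y_i = f(x_i)}\!\! u_i \;+\; e^{\alpha}\!\!\sum_{i:\, y_i \neq f(x_i)}\!\! u_i$$

Setting the derivative with respect to $\alpha$ to zero yields $\alpha = \tfrac{1}{2}\ln\big[(1-\mathrm{err})/\mathrm{err}\big]$, with $\mathrm{err}$ the $u_i$-weighted misclassification rate.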
3. Algorithmic Description: Pseudocode
A concise pseudocode representation is as follows:
Input: {(x_i, y_i)}_{i=1}^n, number of rounds T, SP-parameter λ
Initialize: w_i ← 1/n, v_i ← 1 for all i
F(x) ← 0
for t = 1…T do
    1) Compute v_i ← v*(ℓ_i; λ), where ℓ_i = exp(−y_i·F(x_i))
    2) Fit weak learner f_t: train on {(x_i, y_i)} with weights u_i = v_i·w_i
    3) Compute weighted error err = [∑_{i: y_i ≠ f_t(x_i)} u_i] / [∑_i u_i]
       α_t = ½·ln[(1−err)/err]
    4) Update strong model: F(x) ← F(x) + α_t·f_t(x)
    5) Update AdaBoost weights: w_i ← w_i·exp(−α_t·y_i·f_t(x_i))
end for
Output: final classifier sign(F(x))
The mapping $\ell \mapsto v^*(\ell; \lambda)$ in step 1 is determined analytically, based on the chosen SP-regularizer (see the table in Section 1).
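The pseudocode maps almost one-to-one onto standard library components. Below is an illustrative re-implementation sketch (not the authors' code) using the hard-weighting regularizer and scikit-learn decision stumps as weak learners:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def splboost_fit(X, y, T=100, lam=2.0, max_depth=1):
    """Train SPLBoost with the hard SP-regularizer; labels y must be in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)              # AdaBoost-style sample weights
    F = np.zeros(n)                      # strong-model scores on the training set
    learners, alphas = [], []
    for t in range(T):
        # 1) Closed-form latent weights: drop samples whose loss exceeds lambda.
        loss = np.exp(-y * F)
        v = (loss < lam).astype(float)
        u = v * w
        if u.sum() == 0:                 # every sample pruned; stop early
            break
        # 2) Fit the weak learner under the effective weights u_i = v_i * w_i.
        stump = DecisionTreeClassifier(max_depth=max_depth)
        stump.fit(X, y, sample_weight=u)
        pred = stump.predict(X)
        # 3) Weighted error and coefficient, exactly as in AdaBoost.
        err = np.clip(u[pred != y].sum() / u.sum(), 1e-12, 1 - 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)
        # 4) Update the strong model and 5) the AdaBoost weights.
        F += alpha * pred
        w *= np.exp(-alpha * y * pred)
        learners.append(stump)
        alphas.append(alpha)
    return learners, alphas

def splboost_predict(X, learners, alphas):
    """Sign of the weighted vote over the trained weak learners."""
    scores = sum(a * clf.predict(X) for a, clf in zip(alphas, learners))
    return np.sign(scores)
```

Swapping in the linear or polynomial soft weighting from the earlier sketch changes only the line that computes `v`.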
4. Theoretical Analysis and Guarantees
SPLBoost's procedure is rigorously characterized as a majorization–minimization (MM) algorithm on a latent nonconvex objective of the form:

$$\min_{F} \; \sum_{i=1}^{n} F_\lambda(\ell_i), \qquad \ell_i = \exp\!\big(-y_i F(x_i)\big),$$

where

$$F_\lambda(\ell) = \int_0^{\ell} v^*(t; \lambda)\, dt$$

is a saturated, nonconvex loss function. Each inner step involves constructing and minimizing a tight surrogate of this objective, ensuring its monotonic decrease. The objective is bounded below; thus, the sequence converges to a stationary point.

A key theoretical property is robustness: once the loss $\ell_i$ of a sample exceeds the threshold $\lambda$, $F_\lambda(\ell_i)$ saturates and the sample's gradient vanishes, resulting in automatic exclusion (via $v_i = 0$) of outliers and heavy-noise points.
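As a concrete instance, for the hard-weighting regularizer the integral above evaluates to a truncated exponential loss (a short derivation added here for illustration):

$$F_\lambda(\ell) = \int_0^{\ell} \mathbf{1}\{t < \lambda\}\, dt = \min(\ell, \lambda),$$

i.e., the per-sample loss grows like the exponential loss up to the threshold and is flat, with zero gradient, beyond it.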
5. Empirical Evaluation and Benchmarking
Experimental results are reported across synthetic and real-world benchmarks:
a) Synthetic 2D Gaussian Toy:
- Constructed from two 2D Gaussians (100 samples each) with 15% random label flips.
- Weak learners: C4.5 classification tree (AdaBoost/SPLBoost/RobustBoost), CART regression tree (LogitBoost/RBoost).
- Alternatives compared: AdaBoost, LogitBoost, SavageBoost, RBoost, RobustBoost, SPLBoost (with $\lambda$ tuned).
- SPLBoost is observed to assign zero weight to persistently misclassified (outlier) points and achieves a decision boundary near the Bayes-optimal one. In contrast, AdaBoost/LogitBoost overweight noisy points, and other nonconvex boosters still allocate weight to some outliers. (A sketch of the toy-data setup follows below.)
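For reference, the toy setup can be reproduced along the following lines; the class means and covariances are illustrative assumptions, since only the sample counts and flip rate are specified above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two 2D Gaussian classes, 100 samples each (means/covariances assumed for illustration).
X_pos = rng.multivariate_normal(mean=[+1.5, 0.0], cov=np.eye(2), size=100)
X_neg = rng.multivariate_normal(mean=[-1.5, 0.0], cov=np.eye(2), size=100)
X = np.vstack([X_pos, X_neg])
y = np.concatenate([np.ones(100), -np.ones(100)])

# Inject 15% random label flips, as in the toy experiment.
flip = rng.choice(len(y), size=int(0.15 * len(y)), replace=False)
y[flip] *= -1

# The splboost_fit sketch from Section 3 can then be trained on (X, y).
```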
b) Seventeen UCI Datasets:
- Dataset dimensionality varies from 4 to 72 features; sample sizes range from 200 to 130,000 (e.g., adult, spambase, magic, miniboone).
- Label noise injected at 0%, 5%, 10%, 20%, and 30%.
- Standard splits: 70/30 train/test, with 5-fold cross-validation for $\lambda$ and the number of boosting rounds (maximum 200), repeated 50 times. (A protocol sketch follows this list.)
- SPLBoost exhibits uniformly lower test-error rates at all noise levels. Rank statistics over the 85 dataset/noise combinations position SPLBoost in the top tier for roughly 80% of cases, clearly outperforming convex and nonconvex boosting baselines.
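The noise-injection and splitting protocol is straightforward to mirror in code. The sketch below is illustrative only; in particular, it assumes noise is injected into the training labels, which the summary above does not state explicitly:

```python
import numpy as np
from sklearn.model_selection import train_test_split

def inject_label_noise(y, rate, rng):
    """Flip a random `rate` fraction of binary {-1, +1} labels."""
    y_noisy = y.copy()
    flip = rng.choice(len(y), size=int(rate * len(y)), replace=False)
    y_noisy[flip] *= -1
    return y_noisy

rng = np.random.default_rng(0)
# Stand-in data; in the benchmark this would be one of the UCI datasets with labels in {-1, +1}.
X = rng.normal(size=(1000, 10))
y = np.sign(X[:, 0] + 0.5 * rng.normal(size=1000))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
for rate in (0.0, 0.05, 0.10, 0.20, 0.30):
    y_tr_noisy = inject_label_noise(y_tr, rate, rng)
    # ... cross-validate lambda, train SPLBoost on (X_tr, y_tr_noisy), evaluate on (X_te, y_te) ...
```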
c) Regularizer Ablation:
- Four SPLBoost variants (hard, linear soft, and polynomial soft with two settings of the exponent $t$) yield similar robustness, indicating insensitivity to the choice of SP-regularizer, provided it enforces drop-out of large-loss samples.
6. Mechanisms Underlying SPLBoost Robustness
Instead of designing new robust loss functions, SPLBoost leverages SPL sample selection to truncate the exponential loss beyond a fixed threshold $\lambda$, yielding a saturated loss. The gradient vanishes for samples with large negative margins, so outliers have no influence on weak-learner fitting.
Alternating closed-form updates with standard AdaBoost steps circumvents nonconvex optimization difficulties (e.g., no need for the Newton updates or differential-equation solves required by RobustBoost). The "age" parameter $\lambda$ modulates the algorithm's stringency: a smaller $\lambda$ excludes more samples, while SPLBoost recovers AdaBoost as $\lambda \to \infty$. In practice, $\lambda$ is chosen by cross-validation, with a brief warm-start on $\lambda$ applied to avoid excessive pruning in the initial iterations.
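A small numerical illustration (not from the paper) of the two limiting behaviors: the truncated loss is flat beyond $\lambda$, and for a very large $\lambda$ every sample keeps weight 1, so the update reduces to plain AdaBoost.

```python
import numpy as np

margins = np.array([2.0, 0.5, -0.5, -3.0])   # y_i * F(x_i); very negative = badly misclassified
loss = np.exp(-margins)                      # exponential losses: approx. [0.14, 0.61, 1.65, 20.09]

lam_small, lam_large = 2.0, 1e6
v_small = (loss < lam_small).astype(float)   # strict age: the heavy outlier is dropped -> [1. 1. 1. 0.]
v_large = (loss < lam_large).astype(float)   # lambda -> infinity: all ones, i.e. plain AdaBoost
truncated = np.minimum(loss, lam_small)      # saturated loss min(loss, lambda): capped at 2.0

print(v_small, v_large, truncated)
```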
SPLBoost thus combines automatic sample pruning with additive-model expertise from boosting to provide a scalable and robust classification approach (Wang et al., 2017).