Sparse Broad Learning System (S-BLS)
- S-BLS is a model framework that improves robustness and interpretability of Broad Learning Systems by integrating Sequential Threshold Least-Squares.
- It employs a sparse output weight learning mechanism to prune noisy nodes while maintaining predictive performance.
- Experimental evaluations show a 20–35% RMSE reduction with roughly 50% node pruning on a nonlinear system identification task, and consistently lower RMSE with roughly 70% node pruning on a CSTR benchmark.
The Sparse Broad Learning System (S-BLS) is a model framework developed to enhance the robustness and interpretability of Broad Learning Systems in the presence of sensor noise and outliers, which are common in industrial and nonlinear system identification tasks. S-BLS integrates a Sequential Threshold Least-Squares (STLS) mechanism for learning sparse output weights, offering significant improvements in generalization, efficiency, and noise resilience compared to standard ridge-regression-based BLS solutions (Li, 22 Nov 2025).
1. Structure of the Broad Learning System
The conventional Broad Learning System (BLS) is a shallow network architecture designed for efficiently mapping input data to target outputs with reduced computational complexity. For input data $X \in \mathbb{R}^{N \times d}$, BLS constructs two primary layers before linear output regression:
- Feature Mapping Layer: Input samples are transformed via random weights and a nonlinearity $\phi$ into $n$ feature groups, each of size $k$,
$$Z_i = \phi(X W_{e_i} + \beta_{e_i}), \quad i = 1, \dots, n,$$
which are concatenated to form $Z^n = [Z_1, \dots, Z_n]$.
- Enhancement Layer: The mapped features are further expanded via $m$ enhancement groups (each of size $q$) with another nonlinearity $\xi$,
$$H_j = \xi(Z^n W_{h_j} + \beta_{h_j}), \quad j = 1, \dots, m,$$
and concatenated to yield $H^m = [H_1, \dots, H_m]$.
The system input for regression, $A = [Z^n \mid H^m] \in \mathbb{R}^{N \times L}$, with $L = nk + mq$, feeds a linear read-out $\hat{Y} = AW$ to match the targets $Y$.
Standard BLS computes $W$ by ridge-regularized least squares (pseudoinverse), $W = (A^\top A + \lambda I)^{-1} A^\top Y$, leading to a solution that is typically dense, with each node contributing to the output, thus propagating noise and risking overfitting in non-ideal measurement environments.
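A minimal NumPy sketch of this construction follows. The class name `BLSFeatures`, the tanh nonlinearities, and the standard-normal weight initialization are illustrative assumptions, not choices fixed by the paper.

```python
import numpy as np

class BLSFeatures:
    """Random feature-mapping and enhancement layers of a BLS.

    Illustrative sketch: tanh stands in for both phi and xi, and the
    random weights are sampled once and never trained.
    """

    def __init__(self, d, n_groups=10, k=20, m_groups=10, q=20, seed=0):
        rng = np.random.default_rng(seed)
        # Feature-mapping groups: Z_i = phi(X We_i + be_i), i = 1..n
        self.We = [rng.standard_normal((d, k)) for _ in range(n_groups)]
        self.be = [rng.standard_normal(k) for _ in range(n_groups)]
        # Enhancement groups: H_j = xi(Z^n Wh_j + bh_j), j = 1..m
        self.Wh = [rng.standard_normal((n_groups * k, q)) for _ in range(m_groups)]
        self.bh = [rng.standard_normal(q) for _ in range(m_groups)]

    def transform(self, X):
        """Return the design matrix A = [Z^n | H^m] of shape (N, L)."""
        Z = np.hstack([np.tanh(X @ W + b) for W, b in zip(self.We, self.be)])
        H = np.hstack([np.tanh(Z @ W + b) for W, b in zip(self.Wh, self.bh)])
        return np.hstack([Z, H])

def ridge_readout(A, Y, lam=1e-2):
    """Dense ridge solution W = (A^T A + lam*I)^{-1} A^T Y of standard BLS."""
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ Y)
```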
2. Sparse Output-Weight Learning via Sequential Threshold Least-Squares
S-BLS modifies the output-weight learning by introducing direct sparsity in $W$ using an $\ell_0$-penalized regression objective,
$$\min_{W} \; \|Y - AW\|_F^2 + \lambda_0 \|W\|_0,$$
where $\|W\|_0$ counts the nonzero entries of $W$. Since obtaining the exact $\ell_0$ solution is NP-hard, S-BLS adopts the Sequential Threshold Least-Squares (STLS) strategy, which alternates between elementwise hard-thresholding and least-squares projection on the active variables.
STLS Steps:
- Hard Thresholding: Zero out small weights via $W_{ij} \leftarrow 0$ whenever $|W_{ij}| < \tau$, for a threshold $\tau > 0$.
- Least-Squares Projection: For each output $j$, restrict to the active columns $S_j = \{\ell : |W_{\ell j}| \ge \tau\}$ and compute the restricted least squares,
$$W_{S_j, j} = \arg\min_{w} \|Y_j - A_{S_j} w\|_2^2,$$
with all other weights fixed at zero.
- Iteration: Start with the dense ridge solution $W^{(0)} = (A^\top A + \lambda I)^{-1} A^\top Y$ and apply thresholding and projection for a fixed number $K$ (typically $5$–$10$) of iterations.
Following this procedure results in a sparse $W$ that selectively removes noise-influenced nodes while maintaining predictive performance.
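The sketch below implements STLS on top of the `ridge_readout` helper above; using a ridge refit on the active columns (rather than an unregularized solve) is an assumption made here to keep the restricted systems well conditioned.

```python
def stls(A, Y, tau, lam=1e-2, n_iter=10):
    """Sequential Threshold Least-Squares for a sparse read-out W.

    Alternates elementwise hard thresholding with a least-squares refit
    restricted to the active columns of each output dimension.
    """
    W = ridge_readout(A, Y, lam)              # dense initialization W^(0)
    for _ in range(n_iter):                   # fixed small K, e.g. 5-10
        W[np.abs(W) < tau] = 0.0              # hard-thresholding step
        for j in range(Y.shape[1]):
            active = np.abs(W[:, j]) >= tau   # active set S_j
            if not active.any():
                continue                      # this output was pruned entirely
            As = A[:, active]                 # restricted design A_{S_j}
            W[active, j] = np.linalg.solve(
                As.T @ As + lam * np.eye(As.shape[1]), As.T @ Y[:, j]
            )
    return W
```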
3. Training Procedure and Implementation
The S-BLS training process follows these steps:
- Random Layer Construction: Sample the random weights and biases $\{W_{e_i}, \beta_{e_i}\}_{i=1}^{n}$ and $\{W_{h_j}, \beta_{h_j}\}_{j=1}^{m}$ for all feature and enhancement nodes.
- Layer Computation: Compute all $Z_i$, $H_j$, and concatenate to form $A = [Z^n \mid H^m]$.
- Initialization: Compute $W^{(0)} = (A^\top A + \lambda I)^{-1} A^\top Y$.
- Iterative STLS: For $k = 1, \dots, K$,
  - Hard-threshold $W^{(k-1)}$ elementwise at level $\tau$,
  - For each output $j$, re-solve least squares on the active set $S_j^{(k)}$.
- Termination: Return the final sparse output matrix $W^{(K)}$.
This approach preserves the efficient analytical nature of classical BLS while achieving node-level sparsity that benefits downstream interpretability.
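Putting the pieces above together, a hypothetical end-to-end run might look as follows; the synthetic data, shapes, and hyperparameter values are illustrative, not the paper's settings.

```python
# Synthetic regression task: noisy sin() target from 3 input channels.
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 3))
Y = np.sin(X[:, :1]) + 0.1 * rng.standard_normal((2000, 1))

feats = BLSFeatures(d=3)          # steps 1-2: random layers, A = [Z^n | H^m]
A = feats.transform(X)
W_dense = ridge_readout(A, Y)     # step 3: dense ridge initialization
W_sparse = stls(A, Y, tau=0.05)   # steps 4-5: iterative STLS (tau needs tuning)

active = int((np.abs(W_sparse) > 0).any(axis=1).sum())
rmse = lambda W: float(np.sqrt(np.mean((A @ W - Y) ** 2)))
print(f"active nodes: {active}/{A.shape[1]}; "
      f"train RMSE dense={rmse(W_dense):.4f}, sparse={rmse(W_sparse):.4f}")
```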
4. Experimental Evaluation
S-BLS has been extensively evaluated in two settings:
4.1 Nonlinear System Identification
- Model: A benchmark nonlinear difference equation relating the next output $y(k+1)$ to past outputs and the input $u(k)$, driven by a stochastic input signal in training and a sinusoidal input in test.
- Noise: Gaussian/uniform, with levels from $0.1$ to $0.4$.
- Metrics: Test RMSE and sparsity ratio (active nodes out of $L$ total).
| Noise level | RMSE (BLS) | RMSE (S-BLS) | Active nodes / $L$ |
|---|---|---|---|
| 0.1 | 0.3370 | 0.2588 | 201/401 (50.1%) |
| 0.2 | 0.2222 | 0.1673 | 201/401 |
| 0.3 | 0.2361 | 0.1648 | 201/401 |
| 0.4 | 0.2528 | 0.1632 | 201/401 |
S-BLS achieves a 20–35% RMSE reduction and prunes roughly 50% of nodes.
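A sketch of this identification pipeline appears below. The paper's exact difference equation is not reproduced here; as a stand-in, the code uses a Narendra-type plant $y(k+1) = y(k)/(1 + y(k)^2) + u(k)^3$, which is an assumption for illustration only, as are the noise level and threshold.

```python
def simulate_plant(u, y0=0.0):
    """Stand-in nonlinear difference equation (Narendra-type plant),
    NOT necessarily the benchmark system used in the paper."""
    y = np.zeros(len(u) + 1)
    y[0] = y0
    for k in range(len(u)):
        y[k + 1] = y[k] / (1.0 + y[k] ** 2) + u[k] ** 3
    return y

def make_dataset(u, sigma, rng):
    """Regressors [y(k), u(k)] -> target y(k+1) with additive noise."""
    y = simulate_plant(u)
    X = np.column_stack([y[:-1], u])
    T = y[1:, None] + sigma * rng.standard_normal((len(u), 1))
    return X, T

rng = np.random.default_rng(1)
u_tr = rng.uniform(-1.0, 1.0, 1000)                  # stochastic training input
u_te = np.sin(2 * np.pi * np.arange(500) / 50.0)     # sinusoidal test input
X_tr, Y_tr = make_dataset(u_tr, sigma=0.2, rng=rng)
X_te, Y_te = make_dataset(u_te, sigma=0.0, rng=rng)  # clean test targets

feats = BLSFeatures(d=2)
W = stls(feats.transform(X_tr), Y_tr, tau=0.05)
pred = feats.transform(X_te) @ W
print("test RMSE:", float(np.sqrt(np.mean((pred - Y_te) ** 2))))
```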
4.2 CSTR Benchmark
- System: Nonlinear continuous stirred tank reactor (CSTR) model under noisy, outlier-corrupted conditions.
- Samples: 2,000.
- Noise levels: $0.2$, $0.3$, $0.4$.
| Noise level | RMSE (BLS) | RMSE (S-BLS) | Active nodes (of 201) |
|---|---|---|---|
| 0.2 | 0.0644 | 0.0590 | 61 (30%) |
| 0.3 | 0.0886 | 0.0809 | 61 |
| 0.4 | 0.1111 | 0.1031 | 61 |
S-BLS consistently achieves lower RMSE and prunes approximately 70% of nodes.
5. Computational Complexity and Comparison
Let $N$ denote the number of samples, $L$ the number of initial nodes, and $s$ the number remaining after pruning.
- Standard BLS: One pseudoinverse, $O(NL^2 + L^3)$.
- Lasso-BLS (e.g. ISTA/ADMM): Per-iteration cost on the order of $O(NL)$, but typically requires many iterations to converge.
- S-BLS: Initialization is the same as standard BLS. Each STLS iteration (fixed count $K$):
  - Thresholding: $O(L)$, negligible.
  - Least squares on the $s$ active columns: $O(Ns^2 + s^3)$.
  - Total: $O(NL^2 + L^3 + K(Ns^2 + s^3))$.
Because $s \ll L$ after a few steps, the added cost is marginal relative to standard BLS and much lower than Lasso-based solutions. The sparsity also yields faster inference and improved generalization.
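As a back-of-the-envelope check using the CSTR scale from Section 4.2 ($N = 2000$, $L = 201$, $s = 61$), the cost of one restricted refit relative to the initial dense solve is roughly
$$\frac{Ns^2 + s^3}{NL^2 + L^3} \;\approx\; \left(\frac{s}{L}\right)^2 = \left(\frac{61}{201}\right)^2 \approx 0.09,$$
so each STLS iteration adds on the order of 9% of the initialization cost, and a fixed budget of $K = 5$–$10$ iterations keeps total training within a small constant factor of standard BLS.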
6. Practical Considerations and Applicability
- Threshold Selection ($\tau$): $\tau$ should be aligned with the standard deviation of the noise. Cross-validation on the training set is recommended (a hold-out sketch follows this list). An overly small $\tau$ retains noise; overly large values remove informative nodes.
- Regularization Parameter ($\lambda_0$): Linked to $\tau$ via the optimality (KKT) conditions of the $\ell_0$-penalized problem, but S-BLS tunes the threshold $\tau$ directly.
- Iterations ($K$): Empirically, $K = 5$–$10$ suffices, as the active support stabilizes rapidly.
- Parameterization of Nodes ($L$): Moderate over-parameterization is beneficial; initial redundancy is pruned by STLS.
- Robustness: S-BLS employs hard rather than soft thresholding (as in Lasso), making it particularly adept at rejecting noise-dominated components and handling outlier or non-Gaussian corruption.
- Domains of Application: S-BLS is well-suited for system identification, control-oriented modeling, and any scenario prioritizing interpretability and computational speed.
- Limitations: No global $\ell_0$-optimality guarantee. Performance relies on a quality initial overcomplete basis and careful hyperparameter selection.
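One simple way to implement the cross-validation recommended for $\tau$ above is a hold-out search over a grid of thresholds; the grid, split fraction, and helper name `select_tau` below are illustrative.

```python
def select_tau(A, Y, taus, lam=1e-2, val_frac=0.3, seed=0):
    """Pick the hard threshold tau by hold-out validation RMSE."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(A.shape[0])
    n_val = int(val_frac * A.shape[0])
    val, tr = idx[:n_val], idx[n_val:]
    best_tau, best_rmse = taus[0], np.inf
    for tau in taus:
        W = stls(A[tr], Y[tr], tau=tau, lam=lam)
        rmse = float(np.sqrt(np.mean((A[val] @ W - Y[val]) ** 2)))
        if rmse < best_rmse:
            best_tau, best_rmse = tau, rmse
    return best_tau

# Example: tau = select_tau(A, Y, taus=np.logspace(-3, 0, 10))
```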
In summary, S-BLS achieves a balance between analytical efficiency, model sparsity, and robustness to noise by integrating the STLS algorithm into the BLS framework, offering superior performance in both synthetic and real-world system identification benchmarks, without substantial computational overhead relative to standard BLS (Li, 22 Nov 2025).