RocketStack: Deep Recursive Ensemble
- RocketStack is a deep recursive ensemble learning framework that extends stacking architectures to depths of up to ten levels, integrating predictions level by level while applying adaptive model pruning and feature compression.
- The framework leverages recursive stacking, noise-perturbed pruning, and periodic feature compression methods (such as SFE, autoencoder, and attention-based selection) to control computational and feature complexity.
- Empirical results on binary and multi-class datasets show significant accuracy improvements (up to 6.11% gains) and substantial runtime and feature dimensionality reductions compared to traditional stacking methods.
RocketStack is a level-aware deep recursive ensemble learning framework designed to extend the depth of stacking architectures while controlling computational and feature complexity through adaptive model pruning, feature compression, and stochastic regularization. The methodology systematically advances beyond conventional horizontal diversity in ensemble learning by enabling recursive stacking up to ten levels, thus promoting deeper representational integration across base learners with tractable computational costs (Demirel, 20 Jun 2025).
1. Recursive Stacking Architecture
RocketStack generalizes traditional stacking by constructing a hierarchy of ensemble layers, each integrating predictions from the preceding level through meta-feature concatenation and selective pruning. Let $X^{(0)} \in \mathbb{R}^{n \times d}$ represent the original $n$-sample, $d$-feature training set, with $X^{(0)}_{\text{test}}$ as its hold-out counterpart. At stacking level $\ell$, the ensemble consists of $\mathcal{M}_\ell = \{m_1, \dots, m_{k_\ell}\}$, where $k_\ell$ is the number of models retained post-pruning from the previous level.
Each model $m_i$ undergoes $K$-fold cross-validation; its concatenated out-of-fold (OOF) predictions form $\hat{p}_i \in \mathbb{R}^{n}$. Aggregating predictions across all models yields $P_\ell = [\hat{p}_1 \mid \cdots \mid \hat{p}_{k_\ell}] \in \mathbb{R}^{n \times k_\ell}$. The iterative meta-feature expansion is defined as:

$$X^{(\ell+1)} = X^{(\ell)} \oplus P_\ell, \qquad X^{(\ell+1)}_{\text{test}} = X^{(\ell)}_{\text{test}} \oplus P^{\text{test}}_\ell,$$

where $\oplus$ denotes column-concatenation and $P^{\text{test}}_\ell$ the corresponding hold-out predictions.
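A minimal sketch of this expansion step, assuming scikit-learn base models and binary classification; the helper name `expand_meta_features` and the model choices are illustrative, not from the paper:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

def expand_meta_features(X, y, models, n_folds=5):
    """One level of meta-feature expansion: X^(l+1) = X^(l) (+) P_l,
    where P_l column-stacks each model's K-fold OOF predictions."""
    oof_columns = []
    for model in models:
        # Out-of-fold class-1 probabilities: each sample is predicted
        # by a fold-model that never saw it during training.
        p_hat = cross_val_predict(model, X, y, cv=n_folds,
                                  method="predict_proba")[:, 1]
        oof_columns.append(p_hat.reshape(-1, 1))
    P = np.hstack(oof_columns)    # P_l has shape (n_samples, k_l)
    return np.hstack([X, P])      # column-concatenation X^(l) (+) P_l

# Toy usage: 10 original features + 2 OOF columns -> 12 meta-features
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
models = [RandomForestClassifier(n_estimators=50, random_state=0),
          LogisticRegression(max_iter=1000)]
print(expand_meta_features(X, y, models).shape)   # (200, 12)
```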
Model pruning is performed at each level to ensure $k_{\ell+1} \le k_\ell$. Raw OOF performance scores $s_i$ (accuracy or AUC) are computed for each $m_i$, and a custom threshold $\tau_\ell$ is defined as the quantile at level $q$ of the scores $\tilde{s}_i$, where $\tilde{s}_i$ may be the raw or noise-perturbed score. Only models meeting $\tilde{s}_i \ge \tau_\ell$ are retained.
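A compact sketch of this quantile rule; the quantile level `q` is a free parameter here, and the paper's exact setting may differ:

```python
import numpy as np

def prune_models(models, scores, q=0.5):
    """Keep only models whose (possibly perturbed) score s~_i meets the
    q-quantile threshold tau_l computed over all current scores."""
    scores = np.asarray(scores, dtype=float)
    tau = np.quantile(scores, q)   # pruning threshold tau_l
    return [m for m, s in zip(models, scores) if s >= tau]
```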
2. Pruning Strategies and Feature Compression Mechanisms
A key innovation in RocketStack is the introduction of noise-perturbed pruning and adaptive feature compression, implemented as follows:
Noise-perturbed Pruning
Mild Gaussian noise is added to OOF scores prior to pruning to serve as a regularizer:

$$\tilde{s}_i = s_i + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2),$$

where $\sigma$ controls the perturbation strength. Strict ($\sigma = 0$) and randomized ($\sigma > 0$) schemes are compared.
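A minimal sketch of the perturbation step, compatible with the `prune_models` helper above; the `sigma` values shown are illustrative rather than the paper's settings:

```python
import numpy as np

def perturb_scores(scores, sigma=0.01, seed=None):
    """s~_i = s_i + eps_i with eps_i ~ N(0, sigma^2); sigma = 0
    recovers strict pruning on the raw OOF scores."""
    rng = np.random.default_rng(seed)
    return np.asarray(scores, dtype=float) + rng.normal(0.0, sigma, len(scores))

raw = [0.81, 0.84, 0.79, 0.88]
strict = perturb_scores(raw, sigma=0.0)    # identical to the raw scores
noisy = perturb_scores(raw, sigma=0.01)    # mildly reshuffled ranking
```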
Feature Compression
Feature dimensionality is controlled either at every level or periodically (e.g., levels 3, 6, 9), using one of three compressors:
- Simple, Fast, Efficient (SFE) Filter: a utility score $u_j$ is computed for each feature $j$, and the features with the highest utility are selected (see the filter sketch after this list).
- Autoencoder (AE) Compression: nonlinear reduction obtained by minimizing the reconstruction error $\lVert X - \hat{X} \rVert^2$ of an encoder–decoder with bottleneck dimension $r$.
- Attention-Based Selection: per-feature attention weights $\alpha_j = \mathrm{softmax}(w)_j$ are computed; only features with $\alpha_j$ above a selection threshold $\tau_\alpha$ are kept.
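As referenced in the SFE bullet, a hedged filter-style sketch follows; mutual information stands in for the SFE utility score, whose exact definition follows the original SFE method:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def compress_top_r(X, y, r):
    """Filter-style compression: score every feature with a utility u_j
    (mutual information here, as a stand-in) and keep the top r."""
    utility = mutual_info_classif(X, y, random_state=0)   # u_j per feature
    top = np.argsort(utility)[::-1][:r]                   # highest-utility indices
    return X[:, top], top
```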
A simplified pseudocode of the framework orchestrates the OOF generation, optional feature compression, model evaluation, noise injection, and dynamic pruning per level, with user-specified settings for stacking depth $L$, cross-validation folds $K$, pruning noise $\sigma$, compression mode, compressor type, periodicity, and minimum model count.
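A condensed Python rendering of that orchestration, reusing the helpers sketched above (`expand_meta_features`, `perturb_scores`, `prune_models`, `compress_top_r`); every default setting below is illustrative rather than the paper's:

```python
from sklearn.base import clone
from sklearn.model_selection import cross_val_score

def rocketstack(X, y, models, depth=10, n_folds=5, sigma=0.01,
                q=0.5, period=3, r=32, min_models=2):
    for level in range(1, depth + 1):
        # 1. OOF generation and meta-feature expansion: X <- X (+) P_l
        X = expand_meta_features(X, y, models, n_folds)
        # 2. Periodic compression (e.g., levels 3, 6, 9 when period=3)
        if level % period == 0 and X.shape[1] > r:
            X, _ = compress_top_r(X, y, r)
        # 3. Score each surviving model on the augmented features
        scores = [cross_val_score(clone(m), X, y, cv=n_folds).mean()
                  for m in models]
        # 4. Noise-perturbed quantile pruning (sigma = 0 -> strict)
        if len(models) > min_models:
            models = prune_models(models, perturb_scores(scores, sigma), q)
        # 5. Re-fit the retained learners on the expanded matrix X^(l+1)
        models = [clone(m).fit(X, y) for m in models]
    return models, X
```

The guard on `min_models` mirrors the user-specified minimum model count; in a fuller implementation the pruning and compression steps would also be applied consistently to the hold-out matrix $X^{(\ell)}_{\text{test}}$.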
3. Model Training, Meta-Learner Pooling, and Computational Complexity
At each level, retained base learners are re-trained on the augmented feature matrix $X^{(\ell+1)}$, recursively constructing deeper meta-representations. Rather than a single fixed meta-learner, the ensemble at each level comprises all surviving models in $\mathcal{M}_\ell$, with optional selection of the top-$k$ subset or the single top performer for inference.
The computational cost of each level is dominated by cross-validated training of the $k_\ell$ retained models; filter-based compression adds time roughly linear in the current feature count, autoencoder compression additionally requires epochs of gradient-based training, and pruning reduces to sorting the model scores. Sublinear runtime growth with increasing depth is achieved through aggressive pruning and feature reduction, supporting practical exploration to depths of $L = 10$.
4. Empirical Evaluation across Binary and Multi-Class Datasets
Experiments across 33 OpenML datasets (23 binary, 10 multi-class) demonstrate the efficacy and scalability of RocketStack:
Binary Classification (Periodic SFE at Levels 3/6/9)
- Strict pruning ($\sigma = 0$): 88.08% accuracy at level 10
- Light randomization (small $\sigma > 0$): 88.40% (+0.32%)
- Runtime reduction: 10.5% compared to no compression
- Feature count at L10: 6 (vs. 177 with no compression)
Multi-Class Classification (Periodic Attention)
- Strict pruning: 93.29%
- Light randomization: 93.67% (+0.38%)
- Ultimate accuracy at L10: 98.60% (vs. 92.49% best baseline; +6.11%)
- Runtime reduction: 56.1% relative to no compression
- Feature reduction at L10: From 145 to 38 (74%)
Linear mixed model analysis indicates significant accuracy increases with stacking depth in most configurations. Periodic compression schemes yield the strongest depth trends, while per-level compression often lacks a significant trend.
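A hedged sketch of how such a depth-trend analysis can be run with statsmodels, regressing accuracy on stacking level with a random intercept per dataset; the column names and values below are synthetic placeholders, not the paper's data:

```python
import pandas as pd
import statsmodels.formula.api as smf

# One row per (dataset, level) with the accuracy achieved at that depth
results = pd.DataFrame({
    "dataset":  ["d1"] * 4 + ["d2"] * 4 + ["d3"] * 4,
    "level":    [1, 2, 3, 4] * 3,
    "accuracy": [0.84, 0.86, 0.87, 0.88,
                 0.90, 0.91, 0.93, 0.93,
                 0.78, 0.80, 0.81, 0.83],
})

# Random intercept per dataset; the fixed `level` slope is the depth trend
fit = smf.mixedlm("accuracy ~ level", results, groups=results["dataset"]).fit()
print(fit.summary())
```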
5. Staged Ensemble Dynamics: The Rocket Analogy
RocketStack is conceptualized around the metaphor of multistage rocket engineering, encapsulated as “Prune – Compress – Propel”:
- Prune: Analogous to jettisoning empty fuel tanks, underperforming learners are removed to prevent superfluous complexity.
- Compress: Periodic feature compression parallels stage separation, allowing informative meta-features to accumulate before redundancy is discarded.
- Propel: Mild Gaussian randomization in pruning induces a controlled instability, analogous to guidance feedback in rocket dynamics, promoting diversity and mitigating the risk of premature convergence.
These coordinated mechanisms facilitate deep recursive ensembling with sustainable complexity, enabling superior predictive performance relative to shallower and horizontally-diverse stacking architectures.
6. Significance and Implementation Considerations
RocketStack establishes a scalable paradigm for deep ensemble integration, demonstrating that controlled regularization and staged dimensionality reduction can overcome saturation and complexity barriers that previously limited the depth of stack-based learning. Its modular design accommodates advances in feature compression, meta-learner architectures, and adaptive pruning for continued empirical and theoretical exploration (Demirel, 20 Jun 2025). The detailed pseudocode and equation definitions provided in the original manuscript enable rigorous reimplementation and comparative benchmarking.