MiniRocket: Fast Time Series Classification
- MiniRocket is an efficient time series feature extraction method using fixed, sparse convolutional kernels, log-spaced dilations, and near-deterministic bias selection.
- It simplifies the design of random convolutional transforms by replacing varied pooling operators with a single PPV statistic, greatly reducing computational cost.
- Empirical results validate its high accuracy and scalability across diverse applications such as healthcare, industrial monitoring, and real-time systems.
MiniRocket is an almost-deterministic, extremely efficient time series feature extraction methodology based on random convolutional kernels, designed for scalable, high-accuracy time series classification. Introduced as a refinement of the ROCKET transform, MiniRocket achieves state-of-the-art accuracy with orders-of-magnitude lower computational cost, feature dimensionality, and parameter tuning requirements relative to deep learning and classical feature engineering approaches. Its core innovation is the replacement of large, fully random kernel sets and multiple pooling operators with a constrained, fixed set of sparse kernels, a single pooling statistic (proportion of positive values, PPV), and near-deterministic bias selection, while preserving universality and discriminative power for a broad range of univariate and multivariate sequences.
1. Core Algorithmic Principles and Mathematical Formulation
MiniRocket operates by convolving input time series with a bank of fixed kernels sampled from a small discrete set. Let $X = (x_1, \dots, x_n) \in \mathbb{R}^n$ be a univariate time series of length $n$. The main steps are:
a) Convolutional Feature Generation:
- Kernel design: A fixed set of base kernels of length 9, each with exactly three +2 and six –1 entries.
- Dilations: For each kernel, apply a set of log-uniformly spaced dilations $d = \lfloor 2^x \rfloor$, with exponents $x$ spaced uniformly in $[0, \log_2((n-1)/8)]$, so that the dilated kernel never exceeds the input length.
- Bias selection: Instead of random sampling, the bias for each kernel–dilation pair is set to the median of the convolution output over a small fixed subset of training instances.
Mathematically, for each kernel–dilation pair, the dilated convolution output is $z_i = \sum_{j=0}^{8} w_j \, x_{i + j \cdot d}$, where $W = (w_0, \dots, w_8)$ is the kernel and $d$ the dilation (with padding at the series boundaries).
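In plain Python, this step can be sketched as a valid-mode dilated convolution with one of the fixed kernels (a minimal illustration; the reference implementation uses padding and vectorized, addition-only arithmetic):

```python
# One of the 84 length-9 kernel patterns: three +2 weights, six -1 weights.
# The weights sum to zero, which gives invariance to additive offsets.
KERNEL = [-1, -1, -1, 2, -1, 2, -1, 2, -1]

def dilated_convolve(x, weights=KERNEL, dilation=1):
    """Valid-mode dilated convolution of series x with a length-9 kernel."""
    span = (len(weights) - 1) * dilation
    return [
        sum(w * x[i + j * dilation] for j, w in enumerate(weights))
        for i in range(len(x) - span)
    ]
```

On a constant series every output is zero, since the kernel weights sum to zero; this is the offset invariance noted below.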
b) PPV Pooling:
A single pooling operator is used, the proportion of positive values: $\mathrm{PPV}(Z, b) = \frac{1}{|Z|} \sum_{i} \mathbf{1}[z_i > b]$, i.e., the fraction of convolution outputs $z_i$ exceeding the bias $b$.
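A minimal sketch of PPV:

```python
def ppv(z, bias):
    """Proportion of positive values: fraction of z exceeding the bias."""
    return sum(1 for v in z if v > bias) / len(z)
```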
c) Feature Vector and Classification:
Each kernel–dilation–bias triple produces a scalar PPV feature. Stacking all such outputs yields a feature vector $\Phi(X) \in \mathbb{R}^K$ (by default $K = 9{,}996 \approx 10{,}000$). Classification is performed via a linear model, most often ridge regression: $\hat{w} = \arg\min_{w} \lVert y - \Phi w \rVert_2^2 + \lambda \lVert w \rVert_2^2$.
These design constraints ensure that MiniRocket is invariant to absolute offsets, highly efficient, and numerically stable (Dempster et al., 2020, Lo et al., 2024).
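Putting the three steps together, a toy version of the transform can be sketched as follows (assuming a fixed zero bias rather than the fitted quantile biases, valid-mode convolution, and a handful of dilations; the real method uses padding and roughly 10,000 features):

```python
from itertools import combinations

def minirocket_like_features(x, dilations=(1, 2, 4), biases=(0.0,)):
    """Toy MiniRocket-style transform: all 84 fixed kernels x dilations x
    biases -> PPV features. The real method derives biases from quantiles
    of the convolution output on training data."""
    feats = []
    for pos in combinations(range(9), 3):   # positions of the three +2 weights
        w = [2 if j in pos else -1 for j in range(9)]
        for d in dilations:
            span = 8 * d                    # reach of the dilated kernel
            z = [sum(wj * x[i + j * d] for j, wj in enumerate(w))
                 for i in range(len(x) - span)]
            for b in biases:
                feats.append(sum(v > b for v in z) / len(z))
    return feats
```

In practice the resulting feature matrix is passed to a linear classifier such as scikit-learn's RidgeClassifierCV, as in the published pipeline.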
2. Architectural Innovations and Theoretical Properties
MiniRocket eschews the fully random kernel and pooling design of ROCKET in favor of:
- Fixed kernels: 84 length-9 patterns with two-valued weights (three +2, six –1), enumerated once ($\binom{9}{3} = 84$ placements of the +2 weights) and reused for all data.
- Log-spaced dilations: Ensuring multi-scale feature coverage.
- Bias quantile selection: Biases for each (kernel, dilation) are derived from quantiles of the convolution output, enabling near-determinism; a fully deterministic variant aggregates over all training data.
- Pooling simplification: Only the PPV pooling statistic is used, halving the feature count and compute compared to ROCKET.
- Efficient implementation: All kernels have integer weights (3 × +2, 6 × –1), facilitating vectorized addition-only convolution and feature computation.
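The bias-selection step can be sketched as picking fixed quantile levels of a convolution output (a simplification; the published procedure assigns quantiles per kernel–dilation pair from sampled training instances, and the fully deterministic variant aggregates over all of them):

```python
def biases_from_quantiles(conv_output, quantiles=(0.25, 0.5, 0.75)):
    """Pick biases as fixed quantiles of a training convolution output.
    Fixed quantile levels make the transform near-deterministic: only
    the choice of training instance is random."""
    s = sorted(conv_output)
    return [s[min(int(q * len(s)), len(s) - 1)] for q in quantiles]
```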
This structure produces high discriminability, with theoretical invariance to additive constants and reflection, and allows for parallelized implementation (Dempster et al., 2020, Lo et al., 2024).
3. Computational Complexity and Empirical Performance
MiniRocket’s transform time is linear in the series length, roughly $O(k \cdot n)$ per sample, where $k$ is the number of kernel–dilation combinations and $n$ the input length. Empirical runtime and memory requirements are low (e.g., $0.2$–$0.3$ s per dataset in standard UCR TSC benchmarks, $\approx 10{,}000$-dimensional features, negligible memory overhead for most practical scenarios) (Lo et al., 2024, Dempster et al., 2020).
Performance benchmarks include:
- UCR archive (112 datasets): MiniRocket sits in the highest-accuracy “clique” with TS-CHIEF, HIVE-COTE/TDE, and InceptionTime, but with a far faster transform (Lo et al., 2024, Dempster et al., 2020).
- CMAPSS prognostics: Strong classification accuracy with a ridge head; SVM or shrinkage-LDA heads on MiniRocket features yield up to 96% (Wu et al., 2022).
- EEG (MI-BCI tasks): 98.63% accuracy (MiniRocket+ridge); outperformed CNN-LSTM hybrid (Hwaidi et al., 22 Aug 2025).
- Wearable IMU (real-time dance): Millisecond-scale end-to-end latency for 24-channel input, while maintaining accuracy (Cai et al., 4 Nov 2025).
- Spectral classification for low-data regimes: Outperforms small 1D CNNs when samples per class are limited (Theisen et al., 17 Sep 2025).
4. Algorithmic Extensions and Variants
Several important MiniRocket extensions have been developed:
- SelF-Rocket: Augments MiniRocket by searching over input representations and additional pooling operators (GMP, MPV, MIPV, LSPV), selecting the optimal (input, pooling) pair via a lightweight cross-validation wrapper. SelF-Rocket yields an accuracy gain over MiniRocket, matching larger ensembles in accuracy at minor additional cost ($\approx 2$ s per dataset vs. $0.3$ s for MiniRocket) (Lo et al., 2024).
- HDC-MiniROCKET: Replaces plain PPV pooling with explicit timestamp binding using hyperdimensional computing, enabling positional encoding (critical for tasks with global temporal structure). HDC-MiniROCKET achieves higher accuracy on temporal-dependence tasks and matches MiniRocket speed (Schlegel et al., 2022, Theisen et al., 17 Sep 2025).
- Multichannel applications: MiniRocket kernels can be applied per channel and concatenated; stacking over all channels provides robust features for multivariate sequences, as demonstrated in IMU, EEG, and event log fault prediction (Homm et al., 17 Apr 2025, Vargas et al., 2023).
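The per-channel scheme can be sketched in a few lines, where `transform` stands for any function mapping a univariate series to a feature list (a hypothetical helper, not an API from the cited works):

```python
def multichannel_features(channels, transform):
    """Apply a univariate feature transform to each channel and
    concatenate the results into one feature vector."""
    feats = []
    for x in channels:
        feats.extend(transform(x))
    return feats
```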
5. Applications Across Domains and Data Modalities
MiniRocket has been applied across a wide range of time series classification and forecasting tasks:
- Human activity recognition: MiniRocket delivers strong F1 and AUC on smartphone sensor data; competitive with XGBoost and faster than deep RNNs (Alagoz, 2024).
- Medical signals: Used for EEG motor imagery decoding (98.63% accuracy), ECG morphology embeddings in fusion networks (macro ROC-AUC 0.85), and robust to inter-subject variability (Nguyen et al., 29 Dec 2025, Hwaidi et al., 22 Aug 2025).
- Industrial and remote sensing: Fast, deterministic prediction of ATM faults and event-log anomaly detection, where it substantially outperforms deep and non-temporal ML baselines under severe class imbalance (Vargas et al., 2023).
- Hyperspectral pixel classification: Achieves superior performance to CNNs in low-data regimes with no feature extractor parameters to overfit (Theisen et al., 17 Sep 2025).
- Real-time systems: Millisecond-scale latency for wearable sensor windows, supporting live adaptive multimedia or feedback (Cai et al., 4 Nov 2025).
6. Limitations and Contextual Considerations
While MiniRocket is robust and universally applicable, key limitations include:
- Temporal localization: PPV pooling discards feature location information; HDC-MiniROCKET remedies this where necessary (Schlegel et al., 2022).
- Global context: For tasks requiring explicit timing or positional encoding, vanilla MiniRocket is suboptimal (see synthetic “catastrophic failure” toy datasets).
- Pooling operator rigidity: Single-statistic pooling (PPV) can underperform where max, mean, or length-segment features are critical, motivating variants like SelF-Rocket (Lo et al., 2024).
- Classical linear head: Linear classifier suffices for many regimes but can be outperformed by SVM or shrinkage-LDA in high-dimensional, imbalanced settings (Wu et al., 2022).
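The temporal-localization limitation is easy to demonstrate: PPV counts how often values exceed the bias, not where, so two series whose events occur at different times yield identical features (a toy illustration on raw series rather than convolution outputs):

```python
def ppv(z, bias=0.0):
    """Fraction of values exceeding the bias; position-agnostic."""
    return sum(v > bias for v in z) / len(z)

early = [5] * 5 + [0] * 20   # event at the start of the window
late  = [0] * 20 + [5] * 5   # same event at the end of the window
assert ppv(early) == ppv(late)   # identical features, different timing
```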
7. Impact, Best Practices, and Current Status in TSC
MiniRocket is widely adopted as the reference fast, scalable fixed-feature transform for time series classification, particularly:
- When computational speed or reproducibility is essential.
- In data-scarce environments or real-time pipelines, due to parameter-free feature extraction.
- As a drop-in encoder for hybrid neural systems, or as a baseline for benchmarking new TSC algorithms.
Empirical studies consistently place MiniRocket among the most accurate and computationally efficient TSC methods available, with a technical profile that favors simplicity, scalability, and hardware-agnostic parallelization. It remains the default choice for most TSC settings outside of deep ensemble models or tasks with strong global temporal structure (Dempster et al., 2020, Lo et al., 2024, Homm et al., 17 Apr 2025, Nguyen et al., 29 Dec 2025, Schlegel et al., 2022).