Self-Training Neurochaos Learning
- Self-Training Neurochaos Learning is a hybrid semi-supervised framework that fuses chaos-driven feature encoding with iterative pseudo-labeling.
- It uses chaotic maps to transform features into robust firing-rate representations, capturing latent nonlinear relationships.
- The iterative self-training with a confidence threshold progressively augments labeled data, yielding superior accuracy in imbalanced scenarios.
Self-Training Neurochaos Learning (NL+ST) is a hybrid semi-supervised learning (SSL) framework that combines Neurochaos Learning (NL)—where chaos-based feature transformations reveal latent nonlinear structure—with threshold-based Self-Training (ST) to exploit large quantities of unlabelled data when only a small fraction of samples are labelled. The approach addresses scenarios in which obtaining labelled data is expensive or challenging, particularly for nonlinear or imbalanced classification tasks. NL+ST integrates robust chaos-driven feature engineering with iterative pseudo-labelling of high-confidence unlabelled points, resulting in superior generalisation and classification accuracy relative to conventional SSL techniques (M et al., 3 Jan 2026).
1. Motivation and Theoretical Foundation
Many practical machine learning applications are characterized by a paucity of labelled data and abundant unlabelled samples. Supervised approaches typically overfit or fail to extrapolate in such conditions, notably when the data exhibits strong nonlinearities or class imbalance. SSL leverages unlabelled examples to address this gap, but popular variants may inadequately capture subtle feature relationships.
Neurochaos Learning transforms each sample’s raw features using chaotic dynamics. The output is a “firing-rate” representation—an embedding encoding the response of chaotic neurons to individual features, designed to be robust under limited supervision and resilient to input noise. By integrating NL with threshold-based ST, the framework both distils complex data structure into noise-resistant representations and iteratively expands the labelled set with reliable pseudo-labels, amplifying supervised signal (M et al., 3 Jan 2026).
2. Neurochaos Learning: Chaotic Feature Encoding
NL implements a three-phase pipeline for each feature x_{ij} (the j-th feature of input sample x_i):
- Preprocessing: Each raw feature is scaled to [0, 1].
- Chaotic Encoding: A chaotic neuron for each feature is initialized at the initial state q. The chaotic map f (e.g., the skew tent map) is applied iteratively, u^(t+1) = f(u^(t)) with u^(0) = q, until the trajectory visits the ε-ball about the stimulus, i.e. |u^(t) − x_{ij}| < ε. The number of iterations required is T_{ij}.
- Symbolic Encoding and Firing-Rate Extraction: The trajectory u^(0), u^(1), …, u^(T_{ij}) is thresholded at b to obtain a binary symbolic sequence: s^(t) = 1 if u^(t) ≥ b, else s^(t) = 0. The firing-rate feature is then FR_{ij} = (1/T_{ij}) · Σ_{t=1}^{T_{ij}} s^(t), the fraction of iterations the neuron spends above threshold.
Each sample x_i is thus encoded as FR_i = [FR_{i1}, …, FR_{id}] with FR_i ∈ [0, 1]^d.
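The encoding above can be sketched in a few lines. This is a minimal illustration, not the reference implementation: the parameter values q = 0.34, b = 0.499, ε = 0.01, and the iteration cap are illustrative assumptions (the paper fixes b to maximize symbolic entropy and tunes q by cross-validation).

```python
import numpy as np

def skew_tent(u, b=0.499):
    """Skew tent map on [0, 1] with skew parameter b (illustrative value)."""
    return u / b if u < b else (1.0 - u) / (1.0 - b)

def firing_rate(x, q=0.34, b=0.499, eps=0.01, max_iter=10_000):
    """Firing rate of one chaotic neuron for a scalar feature x in [0, 1].

    Iterates the map from the initial state q until the trajectory enters
    the eps-ball around x (or max_iter is hit, a safety cap added here),
    counting the fraction of iterates at or above the threshold b.
    """
    u, fires, t = q, 0, 0
    while abs(u - x) >= eps and t < max_iter:
        u = skew_tent(u, b)
        fires += u >= b          # symbolic bit s^(t) = [u^(t) >= b]
        t += 1
    return fires / t if t else 0.0   # degenerate case: q already within eps of x

def encode(X, q=0.34, b=0.499, eps=0.01):
    """Map an (n, d) matrix of [0, 1]-scaled features to firing-rate features."""
    return np.vectorize(lambda x: firing_rate(x, q=q, b=b, eps=eps))(X)
```

Each input feature is replaced by a number in [0, 1], so any downstream classifier can consume the firing-rate matrix unchanged.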
NL-generated embeddings have been found to expose nonlinear separabilities obscured in the original feature space and are especially valuable when used with straightforward classifiers in low-data or noisy settings (M et al., 3 Jan 2026).
3. Threshold-Based Self-Training: Pseudo-Label Expansion
Threshold-based Self-Training operates by iteratively augmenting the labelled dataset:
- Let L denote the current labelled set (initially 15% of the data) and U the current unlabelled set (initially the remaining 85%).
- The base classifier (Random Forest, AdaBoost, SVM, Logistic Regression, or Gaussian Naïve Bayes) is trained on L.
- For each x ∈ U, the class posteriors p(y|x) are predicted. The pseudo-label ŷ = argmax_y p(y|x) is retained only if the model confidence ρ(x) = max_y p(y|x) satisfies ρ(x) ≥ τ, where τ is a fixed confidence threshold.
- Update L ← L ∪ {(x, ŷ) : ρ(x) ≥ τ} and U ← U \ {x : ρ(x) ≥ τ}, and iterate. The process terminates when no new high-confidence assignments are made.
This selective, high-confidence assignment reduces the risk of erroneous label propagation and stabilizes training (M et al., 3 Jan 2026).
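The loop above can be sketched with scikit-learn, used here purely as an illustrative base learner; the choice of Logistic Regression, τ = 0.9, and the helper name `self_train` are assumptions for the sketch, not the paper's fixed settings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_l, y_l, X_u, tau=0.9,
               make_clf=lambda: LogisticRegression(max_iter=1000)):
    """Threshold-based self-training (sketch).

    Repeatedly: fit the base classifier on the labelled set, pseudo-label
    the unlabelled points whose maximum class posterior meets tau, move
    them into the labelled set, and stop when no point qualifies.
    """
    X_l, y_l, X_u = np.asarray(X_l), np.asarray(y_l), np.asarray(X_u)
    while len(X_u) > 0:
        clf = make_clf().fit(X_l, y_l)
        proba = clf.predict_proba(X_u)      # class posteriors p(y|x)
        conf = proba.max(axis=1)            # rho(x) = max_y p(y|x)
        keep = conf >= tau
        if not keep.any():                  # no new high-confidence points
            break
        pseudo = clf.classes_[proba[keep].argmax(axis=1)]
        X_l = np.vstack([X_l, X_u[keep]])
        y_l = np.concatenate([y_l, pseudo])
        X_u = X_u[~keep]
    return make_clf().fit(X_l, y_l)         # final classifier C*
```

Note that `clf.classes_` is used to map posterior columns back to label values, so the sketch works for arbitrary (non-contiguous) class labels.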
4. NL+ST Architecture and Implementation Pipeline
NL+ST comprises a full pipeline integrating chaos-based feature encoding and self-training, as formalized below:
```
Input:  D = {x_i}, labels for 15% of D, map f, threshold b, ε, τ, classifier C
Output: trained classifier C*

1. Scale features to [0, 1].
2. For each sample x_i, each feature j:
   a. Set u ← q.
   b. Iterate u ← f(u) until |u − x_{ij}| < ε, counting T_{ij} iterations.
   c. s^(t) ← [u^(t) ≥ b] for each iterate.
   d. FR_{ij} ← (1 / T_{ij}) · Σ_t s^(t).
3. FR_i = [FR_{i1}, ..., FR_{id}].
4. Partition FR-data: L (15% labelled), U (85% unlabelled).
5. Repeat:
   a. Train C on L.
   b. For x ∈ U: ρ(x) = max_y p(y|x).
   c. P ← {(x, ŷ) : ρ(x) ≥ τ}.
   d. L ← L ∪ P; U ← U \ {x : (x, ·) ∈ P}.
   Until P is empty.
6. Return final C.
```
Key hyperparameters are fixed in advance: the firing threshold b (chosen to maximize the symbolic entropy of the chaotic map), the neighbourhood radius ε, and the pseudo-labelling threshold τ; the initial chaotic state q is optimized via 5-fold cross-validation for each dataset/classifier pair (M et al., 3 Jan 2026).
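The cross-validation step for q can be sketched as a simple grid search. The helper `tune_q`, the candidate grid, and the Logistic Regression scorer are all illustrative assumptions; `encode` stands for any function mapping (X, q) to firing-rate features.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def tune_q(X, y, encode, candidates=np.linspace(0.05, 0.95, 10)):
    """Pick the initial chaotic state q by 5-fold CV on the labelled data.

    `encode(X, q)` must return the firing-rate features for initial state q;
    the candidate grid and base classifier here are illustrative choices.
    """
    best_q, best_score = candidates[0], -np.inf
    for q in candidates:
        score = cross_val_score(LogisticRegression(max_iter=1000),
                                encode(X, q), y, cv=5).mean()
        if score > best_score:
            best_q, best_score = q, score
    return best_q
```

Because only the small labelled portion is used for scoring, this search stays cheap even when it is repeated per dataset/classifier pair.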
5. Experimental Evaluation and Performance Analysis
Ten benchmark datasets were employed: Iris, Wine, Breast Cancer Wisconsin, Haberman’s Survival, Ionosphere, Statlog (Heart), Seeds, Palmer Penguins, Pima Indians Diabetes, and Glass Identification. Standard protocol: an 80%/20% train/test split, with only 15% of the training set labelled and the remaining 85% unlabelled. The macro-F1 score measured generalisation under class imbalance.
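The split-and-score protocol can be reproduced as follows; Iris and Random Forest are taken from the paper's dataset and classifier lists, but the random seeds and the labelled-only baseline (self-training would additionally consume the unlabelled pool) are assumptions of this sketch.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# 80/20 train/test split, then keep only 15% of the training set as labelled.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.20,
                                          stratify=y, random_state=0)
X_l, X_u, y_l, _ = train_test_split(X_tr, y_tr, train_size=0.15,
                                    stratify=y_tr, random_state=0)

# Labelled-only baseline; NL+ST would first encode features and then
# grow X_l from X_u via high-confidence pseudo-labels.
clf = RandomForestClassifier(random_state=0).fit(X_l, y_l)
macro_f1 = f1_score(y_te, clf.predict(X_te), average="macro")
print(f"macro-F1: {macro_f1:.3f}")
```

Macro-F1 averages the per-class F1 scores with equal weight, which is why it is the metric of choice under class imbalance.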
The NL+ST approach consistently produced higher macro-F1 scores than standalone ST, with especially marked improvements in small, nonlinear, and imbalanced datasets. Notable gains included:
| Dataset | Classifier | NL+ST Performance Gain Over ST |
|---|---|---|
| Iris | LR | +188.66% |
| Wine | LR | +158.58% |
| Glass Identification | RF | +110.48% |
Full results by dataset and base classifier can be found in the master tables of (M et al., 3 Jan 2026).
6. Architectural Significance, Limitations, and Prospective Extensions
Chaos-derived firing-rate features substantially enhance the separability of nonlinear clusters, empowering conventional classifiers in low-label and noisy settings. The ST loop incrementally increases labelled coverage while maintaining strict confidence, mitigating the risk of systematic label error amplification.
Limitations include sensitivity to the chaotic map parameters (q, b, ε), which necessitates cross-validation and may limit stability. A single confidence threshold may be suboptimal; adaptive or curriculum-based thresholding represents a promising avenue for robustness.
Proposed Extensions encompass:
- Adaptive/curriculum-based pseudo-labeling for dynamic confidence calibration.
- Integration of chaos-derived feature embeddings into neural or deep learning pipelines.
- Unsupervised pretraining with chaotic transformations for tasks such as clustering or anomaly detection (M et al., 3 Jan 2026).
7. Relationship to Broader SSL Research and Application Domains
NL+ST exemplifies advances in SSL that blend principled feature engineering rooted in chaos theory with iterative, high-confidence self-labelling. Its impact is principally pronounced in domains where nonlinear relationships are prevalent and labelled data is scarce—including scientific instrumentation, biomedical diagnostics, and rare-event detection. The methodology also underscores the value of interpretable dynamic representations for resilient, data-efficient learning (M et al., 3 Jan 2026).