Kolmogorov-Arnold Neuro-Fuzzy Inference
- KANFIS is a neuro-symbolic system that uses additive fuzzy rule superposition to overcome exponential rule complexity.
- It leverages the Kolmogorov–Arnold representation to achieve linear parameter scaling and explicitly model uncertainty.
- Empirical results indicate that KANFIS matches or outperforms traditional ANFIS and neural baselines while offering interpretable and sparse rule sets.
The Kolmogorov-Arnold Neuro-Fuzzy Inference System (KANFIS) is a neuro-symbolic framework designed to address the challenge of exponential rule complexity in neuro-fuzzy inference by leveraging the Kolmogorov–Arnold additive representation. By unifying interpretable fuzzy reasoning with the additive decomposition of multivariate functions, KANFIS achieves both linear parameter scaling and explicit uncertainty modeling, while maintaining semantically transparent rule sets and competitive empirical performance relative to established neuro-fuzzy and neural baselines (Yong et al., 3 Feb 2026).
1. Mathematical Foundations
KANFIS builds on the classical Kolmogorov–Arnold superposition theorem, which states that any continuous multivariate function $f : [0,1]^d \to \mathbb{R}$ can be represented as
$$f(x_1, \dots, x_d) = \sum_{q=0}^{2d} \Phi_q\!\left( \sum_{p=1}^{d} \phi_{q,p}(x_p) \right),$$
with each $\Phi_q$ and $\phi_{q,p}$ a univariate continuous function. This decomposition motivates an alternative to the product-based rule firing employed in conventional Adaptive Neuro-Fuzzy Inference Systems (ANFIS).
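As a toy illustration of superposition (not the construction used in the theorem's proof), the product of two positive numbers can be written as an outer univariate function applied to a sum of inner univariate functions, $xy = \exp(\log x + \log y)$. The Python sketch below uses hypothetical function names and assumes strictly positive inputs.

```python
import math

def inner(x):
    """Inner univariate function (here log); defined for x > 0 only."""
    return math.log(x)

def outer(s):
    """Outer univariate function (here exp) applied to the summed inner terms."""
    return math.exp(s)

def superposed_product(x, y):
    """Compute x * y using only univariate functions and addition."""
    return outer(inner(x) + inner(y))

print(superposed_product(2.0, 3.0))  # approximately 6.0
```

The same idea, replacing a multiplicative interaction with a sum of univariate terms, is what the additive rule formulation exploits.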
In traditional ANFIS, Takagi–Sugeno–Kang (TSK) fuzzy rules use the firing strength
$$w_r(x) = \prod_{i=1}^{d} \mu_{i,r}(x_i),$$
where $\mu_{i,r}$ is the membership function for the $i$-th feature and $r$-th rule. This rule formulation requires $M^d$ rules for $M$ fuzzy sets per input and $d$ input dimensions, rapidly leading to intractable model sizes in high dimensions.
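The rule explosion caused by product-based firing can be seen in a small sketch. It assumes grid partitioning (one rule per combination of fuzzy sets), Gaussian memberships with a shared width, and hypothetical names such as `grid_rule_firing`; none of these details are taken from the KANFIS paper.

```python
import math
from itertools import product

def gaussian(x, c, sigma):
    """Gaussian membership value of x for a fuzzy set centred at c."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

def grid_rule_firing(x, centers, sigma=1.0):
    """Enumerate every grid-partition rule and return its product firing strength.

    centers[i] lists the fuzzy-set centres for feature i, so a rule is one
    choice of fuzzy set per feature.
    """
    strengths = {}
    for combo in product(*[range(len(c)) for c in centers]):
        w = 1.0
        for i, k in enumerate(combo):
            w *= gaussian(x[i], centers[i][k], sigma)
        strengths[combo] = w
    return strengths

x = [0.2, 0.8, 0.5]
centers = [[0.0, 0.5, 1.0]] * 3           # M = 3 fuzzy sets, d = 3 features
strengths = grid_rule_firing(x, centers)
print(len(strengths))                      # 27, i.e. 3 ** 3 rules
```

Adding one more feature to this toy setup triples the number of rules, which is the exponential growth the additive formulation is meant to avoid.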
KANFIS replaces the product-based aggregation with an additive superposition. For $R$ rules, each rule consists of univariate fuzzy transforms, one per feature. For each input dimension $i$ and rule $r$, $K$ fuzzy basis functions are learned. The soft antecedent is computed as
$$\phi_{i,r}(x_i) = \sum_{k=1}^{K} \mu_{i,r,k}(x_i),$$
where $\mu_{i,r,k}(x_i)$ is a Type-1 or Interval Type-2 (IT2) membership value. The total additive rule activation is
$$z_r(x) = \sum_{i=1}^{d} \phi_{i,r}(x_i).$$
A direct consequence is that, for a fixed number of rules, both rule count and parameter complexity scale linearly with the input dimension $d$ instead of exponentially (Yong et al., 3 Feb 2026).
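A minimal sketch of the additive rule activation follows. It assumes Gaussian Type-1 memberships and an unweighted sum over bases and features; the nested-list layout and all function names are hypothetical and are not taken from the paper.

```python
import math

def gaussian(x, c, sigma):
    """Type-1 Gaussian membership value."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

def soft_antecedent(x_i, centers, sigmas):
    """Sum of the K membership values of one feature for one rule (assumed unweighted)."""
    return sum(gaussian(x_i, c, s) for c, s in zip(centers, sigmas))

def rule_activation(x, rule_centers, rule_sigmas):
    """Additive activation: sum of soft antecedents over all d features."""
    return sum(
        soft_antecedent(x_i, c_i, s_i)
        for x_i, c_i, s_i in zip(x, rule_centers, rule_sigmas)
    )

# Hypothetical setup: d = 3 features, K = 2 bases per feature, R = 2 rules.
x = [0.5, 0.5, 0.5]
centers = [
    [[0.5, 0.5]] * 3,   # rule 0: every basis centred on the input
    [[0.0, 1.0]] * 3,   # rule 1: bases at the ends of the unit interval
]
sigmas = [[[0.2, 0.2]] * 3, [[0.2, 0.2]] * 3]

z = [rule_activation(x, c, s) for c, s in zip(centers, sigmas)]
print(z[0])  # 6.0, since all d * K = 6 memberships equal 1
```

Each rule here stores `d * K` centers and widths, so adding a feature adds a fixed number of parameters per rule rather than multiplying the rule count.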
2. Architecture and Structural Components
2.1 KANFIS Layer Structure
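A KANFIS layer receives input $x \in \mathbb{R}^d$. For each edge between input $i$ and rule $r$, $K$ fuzzy basis functions are learned:
$$\mu_{i,r,k}(x_i), \quad k = 1, \dots, K.$$
Aggregation occurs across bases and features:
$$z_r(x) = \sum_{i=1}^{d} \sum_{k=1}^{K} \mu_{i,r,k}(x_i).$$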
Multiple such layers can be stacked, with each layer's output vector renormalized before it is passed to the next layer.
The final output is produced by a Takagi–Sugeno linear consequent with learnable weights and a bias.
2.2 Sparse Masking Mechanism
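To enhance interpretability by restricting each rule to a limited subset of features, KANFIS applies a soft mask $m_{i,r} \in [0, 1]$ to each feature-rule edge, yielding
$$z_r(x) = \sum_{i=1}^{d} m_{i,r}\, \phi_{i,r}(x_i),$$
where $\phi_{i,r}(x_i)$ is the soft antecedent of feature $i$ in rule $r$.
An entropy regularizer on the mask pushes $m_{i,r}$ toward binarization. Distinctiveness among rules is encouraged by penalizing high pairwise cosine similarity of rule activations.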
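The masking and regularization terms can be sketched in Python. The specific forms below are assumptions chosen for illustration: mean binary entropy as the mask regularizer, and mean pairwise cosine similarity of per-rule activation vectors collected over a batch as the distinctiveness penalty. The paper's exact definitions may differ, and all names are hypothetical.

```python
import math

def masked_activation(phi, mask):
    """Mask-weighted sum of one rule's per-feature soft antecedents."""
    return sum(m * p for m, p in zip(mask, phi))

def mask_entropy(mask, eps=1e-12):
    """Mean binary entropy of the mask entries; near zero when all entries are near 0 or 1."""
    entries = [m for row in mask for m in row]
    h = 0.0
    for m in entries:
        m = min(max(m, eps), 1.0 - eps)   # clip to avoid log(0)
        h -= m * math.log(m) + (1.0 - m) * math.log(1.0 - m)
    return h / len(entries)

def cosine_similarity(u, v, eps=1e-12):
    """Cosine of the angle between two activation vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)

def distinctiveness_penalty(acts):
    """Mean pairwise cosine similarity; acts[r] holds rule r's activations over a batch.

    Requires at least two rules.
    """
    n = len(acts)
    pairs = [(r, s) for r in range(n) for s in range(r + 1, n)]
    return sum(cosine_similarity(acts[r], acts[s]) for r, s in pairs) / len(pairs)

# Hypothetical values: 2 rules, 3 features.
phi = [[0.9, 0.1, 0.4], [0.2, 0.8, 0.3]]
soft_mask = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]
hard_mask = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(mask_entropy(soft_mask) > mask_entropy(hard_mask))  # True
print(masked_activation(phi[0], hard_mask[0]))            # 0.9
```

Minimizing the entropy term drives each mask entry toward 0 or 1, so a rule ends up reading only a few features, while the cosine term discourages two rules from firing on the same inputs.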
3. Fuzzy Logic and Uncertainty Representation
KANFIS supports both Type-1 and IT2 fuzzy logic.
3.1 Type-1 Fuzzy Sets
Type-1 membership functions can use Gaussian, Generalized Bell, or Sigmoid forms. The Gaussian type is defined as
$$\mu(x) = \exp\!\left( -\frac{(x - c)^2}{2\sigma^2} \right),$$
where the center $c$ and width $\sigma$ are learnable parameters.
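For reference, the three membership families named above can be written in their textbook forms. The parameter names (`c`, `sigma`, `a`, `b`) follow common fuzzy-logic conventions and are assumptions here, not notation confirmed by the paper.

```python
import math

def gaussian_mf(x, c, sigma):
    """Gaussian: exp(-(x - c)^2 / (2 * sigma^2)); peaks at 1 when x == c."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

def gbell_mf(x, a, b, c):
    """Generalized bell: 1 / (1 + |(x - c) / a|^(2b)); a sets width, b sets slope."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2.0 * b))

def sigmoid_mf(x, a, c):
    """Sigmoid: 1 / (1 + exp(-a * (x - c))); equals 0.5 at x == c."""
    return 1.0 / (1.0 + math.exp(-a * (x - c)))

for x in (0.0, 0.5, 1.0):
    print(x, gaussian_mf(x, 0.5, 0.2), gbell_mf(x, 0.2, 2.0, 0.5), sigmoid_mf(x, 10.0, 0.5))
```

All three are smooth in their parameters, which is what allows them to be trained by gradient descent.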
3.2 Interval Type-2 Fuzzy Sets
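Interval Type-2 (IT2) fuzzy sets model additional uncertainty. For each basis, two widths $\sigma_{\text{lo}} \le \sigma_{\text{up}}$ define upper and lower membership functions around a shared center $c$:
$$\overline{\mu}(x) = \exp\!\left( -\frac{(x - c)^2}{2\sigma_{\text{up}}^2} \right), \qquad \underline{\mu}(x) = \exp\!\left( -\frac{(x - c)^2}{2\sigma_{\text{lo}}^2} \right).$$
The crisp activation is their average, $\mu(x) = \tfrac{1}{2}\left( \overline{\mu}(x) + \underline{\mu}(x) \right)$. The region between these curves defines the Footprint of Uncertainty (FOU), providing explicit quantification of ambiguity in the fuzzy representation (Yong et al., 3 Feb 2026).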
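The IT2 construction can be sketched as follows, assuming Gaussian bounds that share a center and differ only in width. The function name and the returned tuple layout are hypothetical.

```python
import math

def it2_gaussian(x, c, sigma_lower, sigma_upper):
    """Return (lower, upper, crisp) memberships for a Gaussian IT2 set with uncertain width.

    The narrower width gives the lower bound, the wider width the upper bound,
    and the crisp activation is the average of the two.
    """
    if sigma_lower > sigma_upper:
        raise ValueError("sigma_lower must not exceed sigma_upper")
    lower = math.exp(-((x - c) ** 2) / (2.0 * sigma_lower ** 2))
    upper = math.exp(-((x - c) ** 2) / (2.0 * sigma_upper ** 2))
    crisp = 0.5 * (lower + upper)
    return lower, upper, crisp

lower, upper, crisp = it2_gaussian(x=0.8, c=0.5, sigma_lower=0.1, sigma_upper=0.3)
fou_width = upper - lower   # local thickness of the Footprint of Uncertainty at x
print(lower, upper, crisp, fou_width)
```

At the center the two bounds coincide at 1, so the footprint has zero thickness there and widens as the input moves away from the center.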
4. Learning, Optimization, and Regularization
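The KANFIS training objective combines a standard regression or classification loss with regularizers for sparsity and distinctiveness:
$$\mathcal{L} = \mathcal{L}_{\text{task}} + \lambda_{\text{sparse}}\, \mathcal{L}_{\text{sparse}} + \lambda_{\text{dist}}\, \mathcal{L}_{\text{dist}},$$
where $\lambda_{\text{sparse}}$ and $\lambda_{\text{dist}}$ weight the two regularizers.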
All parameters are optimized by backpropagation:
- Membership centers and widths are updated via chain-rule derivatives.
- The soft mask receives a combined update from the task loss and the entropy regularizer.
- Takagi–Sugeno consequent weights and bias use standard linear updates.
This joint optimization enforces two structural properties, sparsity (feature selection per rule) and rule distinctiveness, while fitting the predictive task.
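The chain-rule updates for membership centers and widths can be made concrete for a Gaussian membership $\mu = \exp(-(x - c)^2 / (2\sigma^2))$, whose partial derivatives are $\partial\mu/\partial c = \mu\,(x - c)/\sigma^2$ and $\partial\mu/\partial\sigma = \mu\,(x - c)^2/\sigma^3$. The sketch below is an illustration only: the plain gradient-descent step, the `upstream` argument (standing in for $\partial\mathcal{L}/\partial\mu$), and the learning rate are assumptions, and in practice an automatic-differentiation framework would compute these gradients.

```python
import math

def gaussian(x, c, sigma):
    """Gaussian membership value."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

def gaussian_grads(x, c, sigma):
    """Analytic partial derivatives of the membership w.r.t. center and width."""
    mu = gaussian(x, c, sigma)
    d_c = mu * (x - c) / sigma ** 2
    d_sigma = mu * (x - c) ** 2 / sigma ** 3
    return d_c, d_sigma

def sgd_step(x, c, sigma, upstream, lr=0.1):
    """One gradient-descent update, where upstream is dL/dmu from later layers."""
    d_c, d_sigma = gaussian_grads(x, c, sigma)
    return c - lr * upstream * d_c, sigma - lr * upstream * d_sigma

# Check the analytic gradients against central finite differences.
x, c, sigma, h = 0.8, 0.5, 0.3, 1e-6
num_c = (gaussian(x, c + h, sigma) - gaussian(x, c - h, sigma)) / (2 * h)
num_sigma = (gaussian(x, c, sigma + h) - gaussian(x, c, sigma - h)) / (2 * h)
print(gaussian_grads(x, c, sigma), (num_c, num_sigma))
```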
5. Model Complexity, Scalability, and Interpretability
KANFIS mitigates the curse of dimensionality characteristic of traditional neuro-fuzzy inference. In a conventional ANFIS system with $M$ fuzzy sets per feature and $d$ features, the required number of rules is $M^d$, so the parameter count also grows exponentially in $d$.
KANFIS instead requires only $R$ rules, each with $d \cdot K$ fuzzy bases ($K$ per feature), producing $O(RdK)$ parameters and rule complexity that scales linearly in $d$, with $R \ll M^d$.
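The difference in growth can be checked with simple arithmetic. The values used below (3 fuzzy sets per feature for ANFIS; 8 rules with 3 bases per feature for KANFIS) are arbitrary illustrative choices, not settings reported in the paper.

```python
def anfis_rule_count(num_sets, num_features):
    """Grid-partition ANFIS: one rule per combination of fuzzy sets."""
    return num_sets ** num_features

def kanfis_basis_count(num_rules, num_features, num_bases):
    """Additive formulation: num_bases fuzzy bases per feature per rule."""
    return num_rules * num_features * num_bases

for d in (2, 5, 10, 20):
    print(d, anfis_rule_count(3, d), kanfis_basis_count(8, d, 3))
# d = 10 gives 59049 ANFIS rules versus 240 KANFIS bases
```

Doubling the number of features doubles the KANFIS basis count but squares the ANFIS rule count.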
Rule semantics are enhanced by mask-enforced sparsity: at convergence, each hidden unit corresponds to a rule of the form, “IF $x_i$ is in fuzzy set $A_{i,r}$ for those $i$ with $m_{i,r} \approx 1$, THEN the output contribution is given by the rule's Takagi–Sugeno consequent.” Thus, rules are concise and human-interpretable, and rule sets are compact and easily examined by domain experts.
6. Empirical Evaluation
Empirical results on five benchmark datasets indicate that both Type-1 and IT2 variants of KANFIS match or outperform baseline multilayer perceptron (MLP), ANFIS, and Kolmogorov–Arnold Network (KAN) models in regression and classification tasks. On the Combined Cycle Power Plant (CCPP) regression dataset:
- MLP: RMSE = 4.1883
- T1-ANFIS: RMSE = 3.9980
- T1-KANFIS: RMSE = 3.9542
- IT2-KANFIS: RMSE = 4.1240
For classification datasets including Breast Cancer, Spambase, and Medical Health Records, KANFIS achieves accuracy and F1 scores in the range 0.93–0.99, generally outperforming both ANFIS and deep neural baselines while retaining a small, interpretable set of fuzzy rules (Yong et al., 3 Feb 2026).
| Model | Rule/Param Scaling | Explicit Uncertainty | Interpretable Rules | Empirical RMSE (CCPP) |
|---|---|---|---|---|
| ANFIS | Exponential in input dimension | No | No | 3.9980 |
| KANFIS (T1) | Linear in input dimension | No | Yes | 3.9542 |
| KANFIS (IT2) | Linear in input dimension | Yes | Yes | 4.1240 |
| MLP | — | No | No | 4.1883 |
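The size of the CCPP differences is easier to judge as relative changes. The short script below recomputes them from the RMSE values listed above; the helper name is hypothetical.

```python
rmse = {
    "MLP": 4.1883,
    "T1-ANFIS": 3.9980,
    "T1-KANFIS": 3.9542,
    "IT2-KANFIS": 4.1240,
}

def relative_reduction(baseline, model):
    """Percentage by which `model` lowers RMSE relative to `baseline` (negative if worse)."""
    return 100.0 * (rmse[baseline] - rmse[model]) / rmse[baseline]

for baseline, model in [
    ("T1-ANFIS", "T1-KANFIS"),
    ("MLP", "T1-KANFIS"),
    ("MLP", "IT2-KANFIS"),
    ("T1-ANFIS", "IT2-KANFIS"),
]:
    print(f"{model} vs {baseline}: {relative_reduction(baseline, model):+.2f}%")
```

On this dataset T1-KANFIS lowers RMSE by roughly 1.1% relative to T1-ANFIS and 5.6% relative to the MLP, while IT2-KANFIS improves on the MLP but not on T1-ANFIS.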
The data suggest that the KANFIS architecture offers both scalability and interpretability, as well as accurate and uncertainty-aware predictions, by leveraging additive fuzzy rule superposition and explicit rule sparsity controls (Yong et al., 3 Feb 2026).