FlowKANet: Graph Neural Delay Predictor
- The paper introduces a framework that replaces traditional MLPs with spline-based KAN layers, achieving a 5× reduction in trainable parameters compared to baseline GNNs.
- FlowKANet employs iterative message passing with KAMP-Attn, using bidirectional edge processing and attention mechanisms to refine node representations in bipartite graphs.
- The approach enables the derivation of symbolic surrogate models, offering interpretable analytical predictors suitable for real-time, resource-constrained network delay forecasting.
FlowKANet is a graph neural network framework for delay prediction in communication networks that incorporates Kolmogorov-Arnold Networks (KAN) at all neural layers, enabling robust nonlinear modeling with substantial gains in parameter efficiency and interpretability. By replacing standard multi-layer perceptrons (MLPs) within heterogeneous message-passing blocks with spline-based KAN modules, FlowKANet maintains the graph-structured inductive bias and attention mechanisms of classical GNNs, while facilitating the derivation of symbolic surrogates for lightweight deployment and transparent inference (Marouani et al., 24 Dec 2025).
1. Architecture and Computational Graph
FlowKANet models communication networks as bipartite, heterogeneous graphs $G = (\mathcal{F}, \mathcal{L}, \mathcal{E})$, with $\mathcal{F}$ denoting flow nodes, $\mathcal{L}$ link nodes, and $\mathcal{E}$ the set of directed edges encoding the bidirectional flow–link relationships. The architecture performs $L$ rounds of message passing:
- Initial node features $x_f$ (flows) and $x_l$ (links) are encoded via KAN blocks into respective embeddings $h_f^{(0)}$ and $h_l^{(0)}$.
- At each layer $\ell$, directed edges utilize two shared KAN operators: a transformation $\psi^{(\ell)}$ (mapping the sender's embedding to a candidate message) and an attention operator $a^{(\ell)}$ (producing normalized attention weights).
- Flow→link and link→flow edges are processed iteratively, updating node representations via residual aggregation of attention-weighted messages.
- After all layers, each flow node aggregates its final link context and fuses it (via a KAN block) before passing through the KAN readout for delay prediction $\hat{d}_f$.
This design yields a 5× reduction in trainable parameters compared to baseline GNN architectures, primarily by substituting dense MLPs with parameter-efficient spline-based KAN layers (Marouani et al., 24 Dec 2025).
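The layered flow described above can be sketched end-to-end. Everything here (the `blocks` dict of callables, plain mean aggregation in place of KAMP-Attn, the toy dimensions) is an illustrative stand-in under stated assumptions, not the published implementation:

```python
import numpy as np

def forward(xf, xl, edges, blocks, L=3):
    """Structural sketch of a FlowKANet-style forward pass.
    `blocks` maps names to callables standing in for KAN blocks;
    `edges` lists (flow_idx, link_idx) pairs. Attention weighting is
    replaced by a plain mean to keep the sketch short."""
    hf = blocks['enc_flow'](xf)          # encode flow features
    hl = blocks['enc_link'](xl)          # encode link features
    for _ in range(L):                   # L rounds of message passing
        # flow -> link: each link aggregates messages from incident flows
        for l in range(hl.shape[0]):
            nbr = [f for f, ll in edges if ll == l]
            if nbr:
                hl[l] = hl[l] + np.mean([blocks['msg'](hf[f]) for f in nbr], axis=0)
        # link -> flow: each flow aggregates messages from its links
        for f in range(hf.shape[0]):
            nbr = [l for ff, l in edges if ff == f]
            if nbr:
                hf[f] = hf[f] + np.mean([blocks['msg'](hl[l]) for l in nbr], axis=0)
    # fuse each flow's final link context, then read out a per-flow delay
    ctx = np.stack([np.mean([hl[l] for ff, l in edges if ff == f], axis=0)
                    for f in range(hf.shape[0])])
    return blocks['readout'](blocks['fuse'](np.concatenate([hf, ctx], axis=1)))
```

The bipartite edge list drives both directions of message passing, so the graph inductive bias lives entirely in `edges`; swapping the placeholder callables for real KAN blocks recovers the described architecture.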
2. Kolmogorov-Arnold Network Layer Formalism
KAN layers leverage the Kolmogorov-Arnold superposition theorem, admitting universal approximation of all continuous $n$-variate functions:

$$f(x_1, \dots, x_n) = \sum_{q=1}^{2n+1} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)$$

Here, $\phi_{q,p}$ and $\Phi_q$ are learnable univariate splines discretized by their values at $G$ grid points, interpolated via B-splines of order $k$. For $m$-dimensional outputs, a KAN layer computes $y_j = \sum_{i=1}^{n} \phi_{j,i}(x_i)$ with learnable splines $\phi_{j,i}$, $j = 1, \dots, m$, and a linear mixing layer combining the spline outputs. This structure enforces global sparsity and compositionality while retaining full representational power (Marouani et al., 24 Dec 2025).
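A minimal numeric sketch of such a layer, using order-1 (piecewise-linear "hat") splines as a simplified stand-in for the higher-order B-splines described above; `KANLayer`, `hat_basis`, and the fixed grid range are assumptions for illustration:

```python
import numpy as np

def hat_basis(x, grid):
    """Order-1 B-spline (hat) basis on a uniform grid.
    Returns shape (len(x), len(grid)); inputs outside the grid get zero."""
    x = np.atleast_1d(x)
    h = grid[1] - grid[0]
    return np.maximum(0.0, 1.0 - np.abs(x[:, None] - grid[None, :]) / h)

class KANLayer:
    """Sketch of a KAN layer: each (output j, input i) edge carries a
    learnable univariate spline phi_{j,i}, and y_j = sum_i phi_{j,i}(x_i).
    The spline is parameterized by one coefficient per grid point."""
    def __init__(self, n_in, n_out, grid_size=5, rng=None):
        rng = np.random.default_rng(rng)
        self.grid = np.linspace(-1.0, 1.0, grid_size)
        self.coef = rng.normal(scale=0.1, size=(n_out, n_in, grid_size))

    def __call__(self, x):
        # x: (batch, n_in) -> y: (batch, n_out)
        B = hat_basis(x.ravel(), self.grid).reshape(x.shape[0], x.shape[1], -1)
        phi = np.einsum('big,jig->bji', B, self.coef)  # phi_{j,i}(x_i)
        return phi.sum(axis=2)                          # sum over inputs i
```

The parameter count is `n_in * n_out * grid_size`, which makes the linear dependence on grid size discussed later directly visible.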
3. Message Passing and Attention via KAMP-Attn
Central to FlowKANet is KAMP-Attn (Kolmogorov-Arnold Message Passing with Attention). For each edge $(u, v) \in \mathcal{E}$:
- The transformation $\psi^{(\ell)}$ yields message features $m_{uv}^{(\ell)} = \psi^{(\ell)}\big(h_u^{(\ell)}\big)$.
- The attention operator computes scalar scores after LeakyReLU activation, $e_{uv}^{(\ell)} = \mathrm{LeakyReLU}\big(a^{(\ell)}\big[h_u^{(\ell)} \,\|\, h_v^{(\ell)}\big]\big)$. Scores are softmax-normalized within the recipient's neighborhood: $\alpha_{uv}^{(\ell)} = \exp\big(e_{uv}^{(\ell)}\big) \big/ \sum_{u' \in \mathcal{N}(v)} \exp\big(e_{u'v}^{(\ell)}\big)$.
- Node $v$'s embedding is updated via residual aggregation: $h_v^{(\ell+1)} = h_v^{(\ell)} + \sum_{u \in \mathcal{N}(v)} \alpha_{uv}^{(\ell)}\, m_{uv}^{(\ell)}$.
All hyperparameters (grid size $G$, spline order $k$, spline scale) are optimized block-wise via Optuna, ensuring uniform spline expressivity across all message-passing layers (Marouani et al., 24 Dec 2025).
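One KAMP-Attn round can be sketched as follows, with `msg_fn` and `attn_fn` as arbitrary callables standing in for the shared KAN transformation and attention operators:

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def kamp_attn_round(h_src, h_dst, edges, msg_fn, attn_fn):
    """One round of attention-weighted message passing (sketch).
    `edges` is a list of (src, dst) index pairs; msg_fn maps a sender
    embedding to a message, attn_fn maps [h_u || h_v] to a raw score."""
    msgs, scores = {}, {}
    for (u, v) in edges:
        msgs[(u, v)] = msg_fn(h_src[u])
        scores[(u, v)] = leaky_relu(attn_fn(np.concatenate([h_src[u], h_dst[v]])))
    h_new = h_dst.copy()
    for v in range(h_dst.shape[0]):
        nbr = [u for (u, w) in edges if w == v]
        if not nbr:
            continue
        e = np.array([scores[(u, v)] for u in nbr])
        alpha = np.exp(e - e.max()); alpha /= alpha.sum()  # softmax over N(v)
        agg = sum(a * msgs[(u, v)] for a, u in zip(alpha, nbr))
        h_new[v] = h_dst[v] + agg                          # residual update
    return h_new
```

In the full model this round is applied alternately to flow→link and link→flow edge sets, reusing the same two shared operators per layer.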
4. Model Complexity and Scalability Analysis
FlowKANet demonstrates substantial parameter compression and computational gains:
| Model | Trainable Parameters | Per-forward Complexity |
|---|---|---|
| Baseline GNN | 98,210 | quadratic in hidden dimension $d$ |
| FlowKANet | 20,094 | linear in KAN grid size $G$ |
| Symbolic Surrogate | 267 constants | fixed arithmetic operations |

Here $L$ denotes the number of message-passing layers, $d$ the hidden dimension (flows: 8, links: 2), and $G$ the KAN grid size (up to 10). KAN layers incur only a linear cost in grid size $G$, while MLP-based GNNs scale quadratically in $d$. Symbolic surrogates, distilled from trained FlowKANet weights, reduce inference to a minimal set of arithmetic operations (Marouani et al., 24 Dec 2025).
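A back-of-the-envelope parameter count shows why spline blocks compress: a dense MLP block stores full weight matrices, while a KAN block (in the simplified order-1 form) stores one spline coefficient per grid point per input-output pair. The dimensions below are hypothetical, not the paper's:

```python
def mlp_params(d_in, hidden, d_out):
    """Dense two-layer MLP block: weights plus biases."""
    return d_in * hidden + hidden + hidden * d_out + d_out

def kan_params(d_in, d_out, grid_size):
    """KAN block: one coefficient per grid point per (input, output) edge."""
    return d_in * d_out * grid_size

# Illustrative comparison (hypothetical dimensions):
mlp = mlp_params(8, 64, 8)   # dense block with a wide hidden layer
kan = kan_params(8, 8, 8)    # spline block, grid size G = 8
```

Because the KAN count grows linearly in `grid_size` and never pays for a wide hidden layer, the compression compounds across every message-passing block in the network.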
5. Training Procedures and Hyperparameter Optimization
Experiments utilize the GNNet Challenge dataset: 4,389 graphs (train: 3,511; test: 878).
- Loss function: mean squared error (MSE), $\mathcal{L} = \frac{1}{N} \sum_{i=1}^{N} \big(\hat{d}_i - d_i\big)^2$.
- Optimizer: Adam; dropout $0.1$ between KAN blocks; no weight decay.
- Early stopping: based on validation MSE (10% split), with a fixed patience and maximum-epoch budget.
- Optuna (TPE sampler): hyperparameter sweep over hidden dimensions, number of layers, KAN grid size, spline order, spline scale, and activation placements.
Convergence typically occurs in 60–80 epochs for both baseline and FlowKANet. Symbolic surrogate distillation, via block-wise regression (250 trials per block), completes within ~2 hours (Marouani et al., 24 Dec 2025).
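The early-stopping rule described above can be sketched as a plain loop; the `max_epochs` and `patience` defaults are placeholders, since the paper's exact values are not recoverable from this summary:

```python
def train_with_early_stopping(step, validate, max_epochs=200, patience=20):
    """Run training `step`s, stopping when validation MSE fails to improve
    for `patience` consecutive epochs. `step` and `validate` are
    caller-supplied callables; returns the best validation MSE seen."""
    best, best_epoch = float('inf'), 0
    for epoch in range(max_epochs):
        step()                       # one epoch of optimization
        mse = validate()             # validation MSE on the held-out split
        if mse < best:
            best, best_epoch = mse, epoch
        elif epoch - best_epoch >= patience:
            break                    # no improvement for `patience` epochs
    return best
```

The same loop serves both the baseline GNN and FlowKANet, consistent with both converging in a similar 60–80 epoch range.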
6. Empirical Performance and Predictive Accuracy
On the GNNet test set (13,704 flows):

| Model | MSE (↓) | RMSE (↓) | $R^2$ (↑) |
|---|---|---|---|
| Baseline GNN | 38.6358 | 6.214 | 0.8113 |
| FlowKANet | 40.8094 | 6.388 | 0.8727 |
| Symbolic Surrogate | 54.8562 | 7.407 | 0.8290 |

FlowKANet preserves explained variance while incurring only minor RMSE degradation compared to standard GNNs. The surrogate model, with fully analytical expressions, sacrifices some predictive accuracy for maximal transparency (Marouani et al., 24 Dec 2025).
7. Symbolic Surrogate Distillation and Analytical Predictors
FlowKANet models are distilled to symbolic surrogates using PySR via block-wise symbolic regression.
- Blocks are sequentially replaced with closed-form polynomial/rational functions, trained against frozen upstream activations.
- The final predictor maintains the graph-structured dependencies: the per-flow delay combines a quadratic function of flow features with a log-linear combination of flow features and aggregated link loads, parameterized by 267 discovered constants. Importantly, all GNN-style neighborhood summations are retained, embedding the original inductive bias within the analytical form.
The full surrogate reduces inference to a fixed set of arithmetic operations per flow, suitable for microsecond-scale execution on resource-constrained hardware (Marouani et al., 24 Dec 2025).
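To illustrate why inference is cheap, a predictor of the described shape (quadratic in flow features plus a log-linear term in aggregated link load) evaluates in a handful of arithmetic operations. The function below is a hypothetical stand-in with made-up coefficients, not the actual PySR-discovered expressions:

```python
import numpy as np

def surrogate_delay(flow_feats, link_loads, c):
    """Hypothetical distilled predictor shape. `c` stands in for a slice of
    the 267 discovered constants; the neighborhood sum over link loads
    preserves the GNN-style inductive bias in closed form."""
    x = np.asarray(flow_feats)
    rho = float(np.sum(link_loads))            # aggregated load over N(f)
    quad = c[0] + c[1] * x.sum() + c[2] * (x ** 2).sum()  # quadratic term
    return quad + c[3] * np.log1p(rho)         # log-linear load term
```

No matrix multiplies, splines, or iterative passes remain: each flow's delay is a short arithmetic expression over its own features and one neighborhood sum.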
8. Practical Considerations, Trade-offs, and Deployment
Three modeling tiers balance accuracy, efficiency, and interpretability:
- Baseline GNN: Maximizes accuracy (lowest MSE, 38.64), but high memory/compute requirements and opaque internals.
- FlowKANet: Retains competitive accuracy ($R^2 = 0.8727$), offers spline-based function transparency, and reduces parameter footprint by 5×.
- Symbolic Surrogate: Slightly degraded accuracy ($R^2 = 0.8290$), minimal storage (267 constants), fully interpretable and suitable for real-time inference.
Deployment options span real-time network management on edge devices (symbolic surrogate), adaptive traffic engineering at the edge (FlowKANet), and maximal-accuracy settings in data centers (baseline GNN). The unified graph-message-passing backbone ensures consistent graph inductive bias and seamless transition across tiers (Marouani et al., 24 Dec 2025).
FlowKANet, by synergizing KAN universal approximation with GNN message passing and by enabling symbolic regression, establishes an efficient, interpretable, and highly adaptive framework for network delay prediction. The approach connects modern graph learning to the classical theory of function superposition and demonstrates broad applicability across resource-constrained and critical control contexts.