
FlowKANet: Graph Neural Delay Predictor

Updated 31 December 2025
  • The paper introduces a framework that replaces traditional MLPs with spline-based KAN layers, achieving a 5× reduction in trainable parameters compared to baseline GNNs.
  • FlowKANet employs iterative message passing with KAMP-Attn, using bidirectional edge processing and attention mechanisms to refine node representations in bipartite graphs.
  • The approach enables the derivation of symbolic surrogate models, offering interpretable analytical predictors suitable for real-time, resource-constrained network delay forecasting.

FlowKANet is a graph neural network framework for delay prediction in communication networks that incorporates Kolmogorov-Arnold Networks (KAN) at all neural layers, enabling robust nonlinear modeling with substantial gains in parameter efficiency and interpretability. By replacing standard multi-layer perceptrons (MLPs) within heterogeneous message-passing blocks with spline-based KAN modules, FlowKANet maintains the graph-structured inductive bias and attention mechanisms of classical GNNs, while facilitating the derivation of symbolic surrogates for lightweight deployment and transparent inference (Marouani et al., 24 Dec 2025).

1. Architecture and Computational Graph

FlowKANet models communication networks as bipartite, heterogeneous graphs $\mathcal G=(\mathcal V_f \cup \mathcal V_\ell, \mathcal E)$, with $\mathcal V_f$ denoting flow nodes, $\mathcal V_\ell$ link nodes, and $\mathcal E$ the set of directed edges encoding the bidirectional relationships. The architecture performs $K$ rounds of message passing:

  • Initial node features $x_f$ (flows) and $x_\ell$ (links) are encoded via KAN blocks into respective embeddings $\mathbf h_f^{(0)}$ and $\mathbf h_\ell^{(0)}$.
  • At each layer $k$, directed edges $(u \to v)$ utilize two shared KAN operators: a transformation $\mathcal T^{\mathrm{KAN}_{u\to v}}$ (maps the sender's embedding to a candidate message) and an attention operator $\mathcal A^{\mathrm{KAN}_{u\to v}}$ (produces normalized attention weights).
  • Flow→link and link→flow edges are processed iteratively, updating node representations via residual aggregation: $\mathbf h_v^{(k+1)} = \mathbf h_v^{(k)} + \sum_{u\in\mathcal N(v)} \alpha_{uv}^{(k)}\,\tilde{\mathbf h}_{uv}^{(k)}$.
  • After all $K$ layers, each flow node $f$ aggregates its final link context and fuses it (via a KAN block) before passing through the KAN readout for delay prediction $\hat d_f$.
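
The loop above can be sketched as follows. The `enc_*`, `msg_*`, and `readout` callables stand in for the paper's shared KAN blocks, and attention is collapsed to a uniform mean for brevity; all function names and toy inputs here are illustrative, not from the paper.

```python
import numpy as np

def forward(x_f, x_l, links_of, K, enc_f, enc_l, msg_fl, msg_lf, readout):
    """links_of[f] lists the link indices traversed by flow f."""
    h_f = np.stack([enc_f(x) for x in x_f])          # flow embeddings h_f^(0)
    h_l = np.stack([enc_l(x) for x in x_l])          # link embeddings h_l^(0)
    flows_of = [[f for f, ls in enumerate(links_of) if l in ls]
                for l in range(len(x_l))]            # reverse adjacency
    for _ in range(K):
        # flow -> link, then link -> flow; attention collapsed to a mean here
        h_l = h_l + np.stack([np.mean([msg_fl(h_f[f]) for f in fs], axis=0)
                              for fs in flows_of])
        h_f = h_f + np.stack([np.mean([msg_lf(h_l[l]) for l in ls], axis=0)
                              for ls in links_of])
    # fuse each flow embedding with its aggregated link context, then read out
    return np.array([readout(np.concatenate([h_f[f], h_l[links_of[f]].sum(0)]))
                     for f in range(len(x_f))])

ident = lambda x: np.asarray(x, dtype=float)
damp = lambda h: 0.1 * h                             # toy message transform
d_hat = forward(x_f=[[1.0, 0.0], [0.0, 1.0]], x_l=[[1.0, 1.0], [2.0, 0.0]],
                links_of=[[0], [0, 1]], K=3,
                enc_f=ident, enc_l=ident, msg_fl=damp, msg_lf=damp,
                readout=lambda v: float(v.sum()))
```

The two-phase inner loop mirrors the flow→link then link→flow ordering described above; substituting the real KAN operators and the softmax attention of Section 3 recovers the full model.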

This design yields a 5× reduction in trainable parameters compared to baseline GNN architectures, primarily by substituting dense MLPs with parameter-efficient spline-based KAN layers (Marouani et al., 24 Dec 2025).

2. Kolmogorov-Arnold Network Layer Formalism

KAN layers leverage the Kolmogorov-Arnold superposition theorem, which guarantees that every continuous $n$-variate function admits the representation
$$f(x_1,\dots,x_n) = \sum_{q=0}^{2n} \phi_q\left(\sum_{p=1}^n \psi_{q,p}(x_p)\right)$$
Here, $\psi_{q,p}$ and $\phi_q$ are learnable univariate splines, discretized by their values at $G$ grid points and interpolated via B-splines of order $k$. For $m$-dimensional outputs,
$$\mathbf y = W\left[u_0, u_1, \ldots, u_{2n}\right]^\top$$
with $u_q(\mathbf x) = \phi_q(z_q(\mathbf x))$, $z_q(\mathbf x) = \sum_{p=1}^n \psi_{q,p}(x_p)$, and $W$ a linear mixing layer. This structure enforces global sparsity and compositionality while retaining full representational power (Marouani et al., 24 Dec 2025).
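
A minimal numerical sketch of the superposition form, substituting piecewise-linear interpolation for the paper's order-$k$ B-splines; the `Spline1D` and `KASuperposition` names are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

class Spline1D:
    """Learnable univariate function: values at G grid points, evaluated
    by interpolation (linear here for simplicity; the paper uses
    B-splines of order k)."""
    def __init__(self, G=5, lo=-3.0, hi=3.0):
        self.grid = np.linspace(lo, hi, G)
        self.vals = rng.normal(scale=0.5, size=G)   # trainable coefficients
    def __call__(self, x):
        return np.interp(x, self.grid, self.vals)

class KASuperposition:
    """f(x_1..x_n) = sum_{q=0}^{2n} phi_q( sum_{p=1}^n psi_{q,p}(x_p) )."""
    def __init__(self, n, G=5):
        self.psi = [[Spline1D(G) for _ in range(n)] for _ in range(2 * n + 1)]
        self.phi = [Spline1D(G) for _ in range(2 * n + 1)]
    def __call__(self, x):
        z = [sum(psi(xp) for psi, xp in zip(row, x)) for row in self.psi]
        return sum(phi(zq) for phi, zq in zip(self.phi, z))

f = KASuperposition(n=3)        # scalar-valued, 3-variate
y = f([0.1, -0.4, 0.7])
```

Every trainable quantity lives in the per-spline value tables, which is where the $O(G)$ parameter scaling per univariate function comes from.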

3. Message Passing and Attention via KAMP-Attn

Central to FlowKANet is KAMP-Attn (Kolmogorov-Arnold Message Passing with Attention). For each edge $(u \to v)$:

  • The transformation $\mathcal T^{\mathrm{KAN}_{u\to v}}:\mathbb R^{d_u}\rightarrow\mathbb R^{d_v}$ yields message features $\tilde{\mathbf h}_{uv}^{(k)}$.
  • The attention operator $\mathcal A^{\mathrm{KAN}_{u\to v}}: \mathbb R^{d_v}\rightarrow\mathbb R$ computes scalar scores after a LeakyReLU activation: $s_{uv}^{(k)} = \mathcal A^{\mathrm{KAN}_{u\to v}}\left(\mathrm{LeakyReLU}(\mathbf h_v^{(k)}+\tilde{\mathbf h}_{uv}^{(k)})\right)$. Scores are softmax-normalized within the recipient's neighborhood: $\alpha_{uv}^{(k)} = \exp(s_{uv}^{(k)}) \big/ \sum_{w\in\mathcal N(v)}\exp(s_{wv}^{(k)})$.
  • Node $v$'s embedding is updated as $\mathbf h_v^{(k+1)} = \mathbf h_v^{(k)} + \sum_{u\in\mathcal N(v)}\alpha_{uv}^{(k)}\,\tilde{\mathbf h}_{uv}^{(k)}$.
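
The per-recipient update can be sketched directly from these three equations; `attn_fn` stands in for the $\mathcal A^{\mathrm{KAN}}$ scoring block, and the function and variable names are illustrative.

```python
import numpy as np

def kamp_attn_update(h_v, msgs, attn_fn):
    """Residual attention update for one recipient node v.
    msgs: candidate messages h~_uv from the neighbours u of v.
    attn_fn: scalar scoring function (a KAN block in the paper; any
    callable R^d -> R works for this sketch)."""
    leaky = lambda x: np.where(x > 0, x, 0.01 * x)         # LeakyReLU
    s = np.array([attn_fn(leaky(h_v + m)) for m in msgs])  # scores s_uv
    a = np.exp(s - s.max())
    a = a / a.sum()                                        # softmax alpha_uv
    return h_v + sum(ai * mi for ai, mi in zip(a, msgs))   # residual update

h_v = np.zeros(4)
msgs = [np.ones(4), -np.ones(4)]
h_new = kamp_attn_update(h_v, msgs, attn_fn=np.sum)        # favours message 1
```

Subtracting the maximum score before exponentiating is the standard numerically stable softmax and leaves the normalized weights unchanged.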

All hyperparameters (grid size $G$, spline order $k$, scale $\sigma$) are block-wise optimized via Optuna, ensuring uniform spline expressivity across all message-passing layers (Marouani et al., 24 Dec 2025).

4. Model Complexity and Scalability Analysis

FlowKANet demonstrates substantial parameter compression and computational gains:

  Model                Trainable Parameters   Per-forward Complexity
  Baseline GNN         98,210                 $O(K|\mathcal E|\,d_h^2)$
  FlowKANet            20,094                 $O(K|\mathcal E|\,G\,d_h)$
  Symbolic Surrogate   267 constants          $O(|\mathcal E|)$

Here $K=3$ (message-passing layers), $d_h$ is the hidden dimension (flows: 8, links: 2), and $G\approx 5$–$10$ is the KAN grid size. KAN layers incur only a linear cost in the grid size $G$, while MLP-based GNNs scale quadratically in $d_h$. Symbolic surrogates, distilled from trained FlowKANet weights, reduce inference to a minimal set of arithmetic operations (Marouani et al., 24 Dec 2025).
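
As a worked example of the complexity terms in the table (the helper names are ours; the formulas are the table's):

```python
# Cost formulas from the complexity table above (function names are ours;
# K = layers, E = number of edges, d_h = hidden dim, G = KAN grid size)
def baseline_cost(K, E, d_h):
    return K * E * d_h ** 2      # O(K|E| d_h^2): dense MLP message blocks

def flowkanet_cost(K, E, d_h, G):
    return K * E * G * d_h       # O(K|E| G d_h): per-dimension spline work

# the KAN variant is cheaper whenever the grid size G is below d_h
costs = baseline_cost(3, 10_000, 32), flowkanet_cost(3, 10_000, 32, 8)
```

With the paper's small hidden dimensions the two terms are close; the linear-in-$G$ advantage grows with $d_h$.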

5. Training Procedures and Hyperparameter Optimization

Experiments utilize the GNNet Challenge dataset: 4,389 graphs (train: 3,511; test: 878).

  • Loss function: mean squared error (MSE), $\mathcal L=\frac{1}{N}\sum_f(\hat d_f - d_f)^2$.
  • Optimizer: Adam, learning rate $2\times10^{-3}$, dropout 0.1 between KAN blocks, no weight decay.
  • Early stopping: based on validation MSE (10% split), patience of 20 epochs, maximum of 150 epochs.
  • Optuna (TPE sampler): hyperparameter sweep over hidden dimensions $\{8,16,32\}$, layers $K\in\{2,3,4\}$, KAN grid $G\in[5,10]$, spline order $k\in[3,5]$, scale $\sigma\in[0.3,2.5]$, and activation placements.
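
The stated early-stopping protocol (validation MSE, patience 20, maximum 150 epochs) can be sketched as below; `train_epoch` and `val_mse` are placeholders for the real FlowKANet training and validation steps.

```python
# Early-stopping loop matching the stated protocol; train_epoch and
# val_mse are callables supplied by the surrounding training harness.
def fit(train_epoch, val_mse, patience=20, max_epochs=150):
    best, best_epoch, wait = float("inf"), 0, 0
    for epoch in range(1, max_epochs + 1):
        train_epoch()
        m = val_mse()
        if m < best:
            best, best_epoch, wait = m, epoch, 0   # improvement: reset patience
        else:
            wait += 1
            if wait >= patience:
                break                              # 20 epochs w/o improvement
    return best, best_epoch

# simulated validation curve: improves for three epochs, then plateaus
vals = iter([5.0, 4.0, 3.0] + [3.1] * 40)
best, best_epoch = fit(lambda: None, lambda: next(vals))
```

With the simulated curve the loop stops 20 epochs after the last improvement, well before the 150-epoch cap, consistent with the reported 60–80 epoch convergence.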

Convergence typically occurs in 60–80 epochs for both baseline and FlowKANet. Symbolic surrogate distillation, via block-wise regression (250 trials per block), completes within ~2 hours (Marouani et al., 24 Dec 2025).

6. Empirical Performance and Predictive Accuracy

On the GNNet test set (13,704 flows):

  Model                MSE (↓)    RMSE (↓)   R² (↑)
  Baseline GNN         38.6358    6.214      0.8113
  FlowKANet            40.8094    6.388      0.8727
  Symbolic Surrogate   54.8562    7.407      0.8290

FlowKANet preserves explained variance while incurring only minor RMSE degradation compared to standard GNNs. The surrogate model, with fully analytical expressions, sacrifices some predictive accuracy for maximal transparency (Marouani et al., 24 Dec 2025).

7. Symbolic Surrogate Distillation and Analytical Predictors

FlowKANet models are distilled to symbolic surrogates using PySR via block-wise symbolic regression.

  • Blocks are sequentially replaced with closed-form polynomial/rational functions, trained against frozen upstream activations.
  • The final predictor maintains the graph-structured dependencies: $\hat d_f = P\left(S_f(x_f), \sum_{\ell\in\mathcal N(f)}L_\ell\right)$, where $S_f(x_f)$ is a quadratic function of flow features and $P(\cdot,\cdot)$ is a log-linear combination of $S_f(x_f)$ and the aggregated link loads, parameterized by 267 discovered constants. Importantly, all GNN-style neighborhood summations are retained, embedding the original inductive bias within the analytical form.
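
A toy instance of the distilled functional form follows. The quadratic $S_f$ and log-linear $P$ match the structure described above, but every constant is made up purely for illustration; the real predictor's 267 constants are discovered by PySR.

```python
import math

# Toy surrogate with the distilled structure d_f = P(S_f(x_f), sum_l L_l);
# all constants below are invented for illustration only.
def S_f(x):
    return 1.0 + 0.5 * x[0] + 0.2 * x[1] ** 2      # quadratic flow term

def surrogate_delay(x_f, link_loads, c=(0.8, 1.3, 0.6)):
    agg = sum(link_loads)                          # neighbourhood sum over links
    return c[0] + c[1] * S_f(x_f) + c[2] * math.log1p(agg)

d_hat = surrogate_delay([0.3, 1.2], [0.5, 0.9, 0.1])
```

Note that the neighbourhood summation over link loads survives distillation, which is what keeps the surrogate's inference cost at one pass over the edges.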

The full surrogate achieves $O(|\mathcal E|)$ inference complexity, suitable for microsecond-scale execution on resource-constrained hardware (Marouani et al., 24 Dec 2025).

8. Practical Considerations, Trade-offs, and Deployment

Three modeling tiers balance accuracy, efficiency, and interpretability:

  • Baseline GNN: maximizes accuracy ($\mathrm{RMSE}=6.21$), but has high memory/compute requirements and opaque internals.
  • FlowKANet: retains competitive accuracy ($\mathrm{RMSE}=6.39$), offers spline-based function transparency, and reduces the parameter footprint by roughly 5×.
  • Symbolic Surrogate: slightly degraded accuracy ($\mathrm{RMSE}=7.41$), minimal storage (267 constants), fully interpretable and suitable for real-time inference.

Deployment options span real-time network management on edge devices (symbolic surrogate), adaptive traffic engineering at the edge (FlowKANet), and maximal-accuracy settings in data centers (baseline GNN). The unified graph-message-passing backbone ensures consistent graph inductive bias and seamless transition across tiers (Marouani et al., 24 Dec 2025).

FlowKANet, by synergizing KAN universal approximation with GNN message passing and by enabling symbolic regression, establishes an efficient, interpretable, and highly adaptive framework for network delay prediction. The approach connects modern graph learning to the classical theory of function superposition and demonstrates broad applicability across resource-constrained and critical control contexts.
