
PO-CKAN: Physics-Informed Operator Learning

Updated 6 March 2026
  • PO-CKAN is a neural operator learning framework that integrates a DeepONet branch–trunk architecture with chunkwise rational KAN modules to efficiently approximate parametric PDE solutions.
  • Its chunkwise rational activations reduce parameter count and FLOPs while maintaining full connectivity, leading to improved shock capturing and detailed geometry resolution.
  • PO-CKAN employs composite PINN-style loss functions to enforce physical constraints, achieving enhanced physical fidelity and sharper convergence across diverse PDE benchmarks.

PO-CKAN (Physics-Informed Deep Operator Kolmogorov–Arnold Network with Chunk Rational Structure) is a neural operator learning framework that integrates DeepONet-style architecture with chunkwise rational Kolmogorov–Arnold Network (CKAN) modules and enforces physics-informed constraints. It is specifically engineered to approximate solution operators for parametric families of partial differential equations (PDEs) and achieves substantial improvements in expressivity, efficiency, and physical fidelity compared to standard MLP or KAN-based approaches (Wu et al., 9 Oct 2025).

1. Architecture: DeepONet with CKAN Sub-networks

PO-CKAN is built upon the DeepONet “branch–trunk” paradigm for operator learning:

  • Branch Network (Bₜ): Encodes the input function $u(y)$ (e.g., initial/boundary condition, discretized over $p$ sensors) into a feature vector $b = (b_1, ..., b_p) \in \mathbb{R}^p$.
  • Trunk Network (Tₜ): Maps a spatio-temporal coordinate $x$ to $p$ basis functions $t(x) = (t_1(x), ..., t_p(x)) \in \mathbb{R}^p$.
  • Operator Output: The solution at coordinate $x$ is obtained as

$$s(x) = G(u)(x) \approx \sum_{k=1}^p b_k t_k(x)$$

(see Eq. (1) in (Wu et al., 9 Oct 2025)).
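
To make the branch-trunk combination concrete, here is a minimal sketch of the output formula only. The function name `deeponet_output` and the feature values are hypothetical; in PO-CKAN the two feature vectors would come from the CKAN branch and trunk networks, which are not modeled here.

```python
def deeponet_output(branch_features, trunk_features):
    """Combine branch and trunk features: s(x) ~= sum_k b_k * t_k(x)."""
    if len(branch_features) != len(trunk_features):
        raise ValueError("branch and trunk must both produce p features")
    return sum(b * t for b, t in zip(branch_features, trunk_features))


# Hypothetical feature vectors with p = 3 (values chosen for illustration only).
b = [0.5, -1.0, 2.0]   # branch output for one input function u
t = [1.0, 0.25, 0.5]   # trunk output for one query coordinate x
s = deeponet_output(b, t)   # 0.5*1.0 - 1.0*0.25 + 2.0*0.5 = 1.25
```

Because the branch output does not depend on $x$, it can be computed once per input function and reused for every query coordinate.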

Uniquely, both branch and trunk networks are instantiated as chunkwise rational KANs (CKANs), rather than standard MLPs or vanilla KANs.

CKAN Layer

CKAN addresses the quadratic parameter bottleneck of vanilla KANs by:

  • Rational Activations: Each edge activation is parameterized as

$$\varphi(x) = w F(x), \quad F(x) = a x + b + \sum_{k=1}^{n/2} \frac{c_k x + d_k}{(x-e_k)^2 + f_k^2 + \epsilon}$$

where $\{a, b, c_k, d_k, e_k, f_k\}$ are learnable, the degree $n$ is even, and $\epsilon > 0$ ensures no poles (Eq. (4)).

  • Chunkwise Sharing: Inputs/outputs are partitioned into $c \times c$ chunks. Within each chunk, edges share the same rational base function $F_{m,n}(\cdot)$ but maintain individual scalar weights $w_{ij}$. This reduces parameter count and FLOPs from $O(d_\text{in}\cdot d_\text{out})$ to $O(c^2)$ for the rational basis, while preserving full connectivity.
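
The two ideas above can be sketched in plain Python. This is an illustrative reading of the description, not the authors' implementation. The names `rational_base` and `ckan_layer`, the layout of `base_params`, the convention that the first chunk index refers to outputs and the second to inputs, and the requirement that $c$ divides both layer widths are all assumptions made for the example.

```python
def rational_base(x, a, b, poles, eps=1e-6):
    """F(x) = a*x + b + sum_k (c_k*x + d_k) / ((x - e_k)**2 + f_k**2 + eps)."""
    total = a * x + b
    for c_k, d_k, e_k, f_k in poles:  # n/2 pole terms for an even degree n
        total += (c_k * x + d_k) / ((x - e_k) ** 2 + f_k ** 2 + eps)
    return total


def ckan_layer(x, weights, bias, base_params, c):
    """Chunkwise rational layer (sketch).

    x           : list of d_in inputs
    weights     : d_out x d_in nested list of per-edge scalars w_ij
    bias        : list of d_out biases
    base_params : c x c nested list of (a, b, poles), indexed [output chunk][input chunk]
    """
    d_in, d_out = len(x), len(weights)
    if d_in % c or d_out % c:
        raise ValueError("this sketch assumes c divides both layer widths")
    in_size, out_size = d_in // c, d_out // c
    # Evaluate each shared base function once per (output chunk, input):
    # d_in * c rational evaluations instead of d_in * d_out.
    basis = [
        [rational_base(x[j], *base_params[m][j // in_size]) for j in range(d_in)]
        for m in range(c)
    ]
    return [
        bias[i] + sum(weights[i][j] * basis[i // out_size][j] for j in range(d_in))
        for i in range(d_out)
    ]


# Toy layer: d_in = 4, d_out = 2, c = 2, degree n = 2 (one pole term per base function).
shared = (1.0, 0.0, [(0.0, 1.0, 0.0, 1.0)])   # hypothetical (a, b, poles)
base_params = [[shared, shared], [shared, shared]]
weights = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
y = ckan_layer([0.0, 1.0, 2.0, 3.0], weights, [0.0, 0.0], base_params, c=2)
```

Every input still connects to every output through its own weight $w_{ij}$; only the nonlinear base functions are shared, which is why the rational evaluations scale with $d_\text{in} \cdot c$ rather than with the number of edges.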

Parameter and FLOP comparison for a single layer:

| Model | Parameters | FLOPs |
|---|---|---|
| MLP | $d_\text{in} \cdot d_\text{out} + d_\text{out}$ | (not given) |
| KAN | $\sim d_\text{in} \cdot d_\text{out} \cdot (G+K+3) + d_\text{out}$ | $\sim O(d_\text{in} \cdot d_\text{out} \cdot K \cdot G)$ |
| CKAN | $d_\text{in} \cdot d_\text{out} + d_\text{out} + (2n+2)c^2$ | $(4.5n+1)\, d_\text{in}\, c + 2\, d_\text{in}\, d_\text{out}$ |

This organization delivers the expressivity of KANs with a tractable memory and computational profile (Wu et al., 9 Oct 2025).
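
As a rough numerical illustration, the per-layer formulas from the comparison can be evaluated directly. The layer width (100), degree $n = 4$ and $1 \times 1$ chunking follow the Burgers' setup described in this article; the KAN settings $G = 5$ and $K = 3$ are hypothetical values chosen only for the example.

```python
def mlp_params(d_in, d_out):
    """Weights plus biases of a dense layer."""
    return d_in * d_out + d_out


def kan_params(d_in, d_out, G, K):
    """Approximate KAN parameter count from the comparison above."""
    return d_in * d_out * (G + K + 3) + d_out


def ckan_params(d_in, d_out, n, c):
    """Edge weights, biases, and (2n + 2) coefficients per shared base function."""
    return d_in * d_out + d_out + (2 * n + 2) * c ** 2


def ckan_flops(d_in, d_out, n, c):
    """CKAN FLOP estimate from the comparison above."""
    return (4.5 * n + 1) * d_in * c + 2 * d_in * d_out


d_in = d_out = 100
print("MLP params :", mlp_params(d_in, d_out))             # 10100
print("KAN params :", kan_params(d_in, d_out, G=5, K=3))   # 110100 (G, K hypothetical)
print("CKAN params:", ckan_params(d_in, d_out, n=4, c=1))  # 10110
print("CKAN FLOPs :", ckan_flops(d_in, d_out, n=4, c=1))   # 21900.0
```

With these settings the CKAN layer carries only ten more parameters than the MLP layer, since the rational coefficients are shared across the whole chunk.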

2. Physics-Informed Losses and Constraints

PO-CKAN enforces the underlying physics of the target PDE via a composite, PINN-style loss:

$$\mathcal{L}(\theta) = \lambda_\text{data} \mathcal{L}_\text{data} + \lambda_\text{ic} \mathcal{L}_\text{ic} + \lambda_\text{bc} \mathcal{L}_\text{bc} + \lambda_r \mathcal{L}_r$$

with data, initial-condition, boundary-condition, and PDE residual terms respectively.

  • Data Loss:

$$\mathcal{L}_\text{data} = \frac{1}{N_d} \sum_{i=1}^{N_d} \left\| G_\theta(u_i)(y_i) - s_i(y_i) \right\|_2^2$$

  • Initial Condition Loss:

$$\mathcal{L}_\text{ic} = \frac{1}{N_\text{ic}} \sum_{j=1}^{N_\text{ic}} \left\| G_\theta(u_j)(y_j^{\text{ic},0}) - s_0(y_j^\text{ic}) \right\|_2^2$$

  • Boundary Condition Loss (Dirichlet):

$$\mathcal{L}_\text{bc} = \frac{1}{N_\text{bc}} \sum_{j=1}^{N_\text{bc}} \left\| G_\theta(u_j)(y_j^\text{bc}) - s_\text{bc}(y_j^\text{bc}) \right\|_2^2$$

  • PDE Residual Loss:

$$\mathcal{L}_r = \frac{1}{N_p} \sum_{j=1}^{N_p} \left\| \mathcal{R}(u_j, G_\theta(u_j))(y_j^\text{phys}) \right\|_2^2$$

where $\mathcal{R}(u, s)$ is the PDE residual; all required derivatives are computed via automatic differentiation. The hyperparameters $\lambda_*$ are chosen per problem (Wu et al., 9 Oct 2025).
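
The weighted sum of the four terms can be sketched as follows for scalar-valued solutions. The helper names, the dictionary layout of the weights, and the numeric values are hypothetical; each residual list stands for the already-computed differences inside the norms above.

```python
def mean_squared(residuals):
    """Mean of squared residuals (scalar case of the averaged squared 2-norm)."""
    return sum(r * r for r in residuals) / len(residuals)


def composite_loss(data_res, ic_res, bc_res, pde_res, weights):
    """lambda_data*L_data + lambda_ic*L_ic + lambda_bc*L_bc + lambda_r*L_r.

    Terms with an empty residual list are skipped, which covers the
    purely physics-informed setting where no paired data is available.
    """
    terms = {"data": data_res, "ic": ic_res, "bc": bc_res, "r": pde_res}
    return sum(weights[name] * mean_squared(res) for name, res in terms.items() if res)


# Hypothetical residuals and weights, for illustration only.
weights = {"data": 1.0, "ic": 10.0, "bc": 10.0, "r": 1.0}
loss = composite_loss(
    data_res=[],                 # no paired input-solution data
    ic_res=[0.1, -0.1],          # L_ic = 0.01
    bc_res=[0.2],                # L_bc = 0.04
    pde_res=[0.0, 0.3, -0.3],    # L_r  = 0.06
    weights=weights,
)
# loss = 10*0.01 + 10*0.04 + 1*0.06 = 0.56 (up to floating-point rounding)
```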

3. Training Protocol and Benchmark Problems

All benchmarks employ the Adam optimizer and train solely with physics-informed losses (no paired input–solution data beyond IC/BC). Three canonical testbeds illustrate the method:

  • Burgers’ Equation (1D): For $\nu \in \{0.05, 0.03, 0.01\}$, with $u$ sampled from a Gaussian random field, $N_\text{train} = 1500$, $N_\text{test} = 500$ cases (101 time-snapshots each). Network: 4 CKAN layers ($1\times1$), $n=4$, 100 units/layer. Baseline: 4$\times$100 MLP (PI-DeepONet).
  • Eikonal Equation (2D): Domain $[-2,2]^2$, random circle boundaries. Network: 4 CKAN layers ($2\times2$), $n=4$, 50 units/layer.
  • Diffusion–Reaction: $D = k = 0.01$. IC/BC homogeneous, 5 layers of 50 units (CKAN $2\times2$, $n=4$ rational units).

No ground-truth data (beyond the required IC/BC) is used; only the PINN composite loss guides learning (Wu et al., 9 Oct 2025).
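
To illustrate what a residual term looks like for the Burgers' benchmark, the sketch below evaluates $s_t + s\, s_x - \nu\, s_{xx}$ at a few collocation points. Two assumptions are worth flagging: the standard viscous form of Burgers' equation is assumed, since the article does not write it out, and central finite differences stand in for the automatic differentiation used by PO-CKAN. The function names and the test solution are hypothetical.

```python
def burgers_residual(s, x, t, nu, h=1e-4):
    """Residual s_t + s*s_x - nu*s_xx of a candidate solution s(x, t).

    Derivatives use central finite differences here, as a stand-in
    for automatic differentiation through the operator network.
    """
    s_t = (s(x, t + h) - s(x, t - h)) / (2 * h)
    s_x = (s(x + h, t) - s(x - h, t)) / (2 * h)
    s_xx = (s(x + h, t) - 2 * s(x, t) + s(x - h, t)) / (h * h)
    return s_t + s(x, t) * s_x - nu * s_xx


def exact(x, t):
    """s = x / (1 + t) satisfies the viscous Burgers' equation for any nu."""
    return x / (1.0 + t)


# The residual should be close to zero for the exact solution ...
residuals = [burgers_residual(exact, x, 0.5, nu=0.01) for x in (-0.5, 0.0, 0.5)]

# ... and clearly nonzero for a function that does not solve the PDE.
bad = burgers_residual(lambda x, t: x, 0.5, 0.0, nu=0.01)   # approximately 0.5
```

During training, the squared residuals at the collocation points are averaged to form the residual loss term, so driving them toward zero pushes the network output toward a solution of the PDE.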

4. Quantitative Performance and Expressivity

Across all benchmarks, PO-CKAN demonstrates marked improvements over PI-DeepONet and baseline PINN variants.

| Problem / Metric | PI-DeepONet Error | PO-CKAN Error | Improvement |
|---|---|---|---|
| Burgers' ($\nu=0.01$) | $6.23 \times 10^{-2}$ | $3.21 \times 10^{-2}$ | ~48% reduction |
| Eikonal (2D) | $> 1 \times 10^{-1}$ | $5.10 \times 10^{-3}$ | > 20$\times$ lower |
| Diffusion-Reaction | $5.19 \times 10^{-3}$ | $2.58 \times 10^{-3}$ | >50% reduction |
| Fractional PDE | $1.32 \times 10^{-1}$ | $2.54 \times 10^{-2}$ | ~80% reduction |

For the Eikonal problem, the test loss converges roughly two orders of magnitude lower than the baseline, and the maximum absolute error improves similarly (0.016 for PO-CKAN vs. 2.5 for PI-DeepONet). Results generalize across parametric variations, input regularities, and geometric complexity (Wu et al., 9 Oct 2025).
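
The percentage improvements quoted for the benchmarks follow from the reported errors by simple arithmetic, which the short check below reproduces. The helper name `relative_reduction` is hypothetical. The Eikonal row is left out because its baseline error is reported only as a lower bound.

```python
def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# (PI-DeepONet error, PO-CKAN error) as reported in the comparison above.
errors = {
    "Burgers (nu=0.01)": (6.23e-2, 3.21e-2),
    "Diffusion-Reaction": (5.19e-3, 2.58e-3),
    "Fractional PDE": (1.32e-1, 2.54e-2),
}

for name, (baseline, po_ckan) in errors.items():
    print(f"{name}: {relative_reduction(baseline, po_ckan):.1f}% reduction")
# Burgers (nu=0.01): 48.5% reduction
# Diffusion-Reaction: 50.3% reduction
# Fractional PDE: 80.8% reduction
```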

PO-CKAN’s chunkwise rational structure yields:

  • Substantial parameter reduction compared to vanilla KANs ($O(c^2)$ rational base functions vs $O(d_\text{in} d_\text{out})$).
  • $10\times$ fewer FLOPs compared to B-spline KANs.
  • Enhanced convergence and representational capacity, evident in sharper shock capturing (Burgers’) and finer geometric detail (Eikonal).
  • Consistent outperformance over deep MLP or standard operator network baselines.

5. Advantages, Limitations, and Research Directions

Advantages

  • Parameter/FLOP Efficiency: CKAN’s chunkwise rational activations enable full connectivity with tractable scaling, supporting larger, deeper, or more expressive models.
  • Physical Consistency: Integrated PINN losses penalize violations of the underlying PDE constraints, steering solutions toward physical consistency (no ground-truth data required except at boundaries/initialization).
  • Generalization: PO-CKAN is effective across diverse PDE families, including parametric, nonlinear, and fractional-order equations.

Limitations

  • Adaptive Complexity: Fixed chunking and rational order may limit local adaptivity. Adaptive strategies for chunk granularity or rational degree are needed for sharp local features or heterogeneous domains.
  • Geometry: Extension to arbitrary or highly complex domains requires additional machinery (e.g., domain decomposition, meshless collocation, XPINNs).
  • Uncertainty Quantification: No native quantification of prediction uncertainty; prospective Bayesian or ensemble PINN extensions would address this.
  • Scalability: While chunked, the architecture still entails nontrivial computational cost for very high-dimensional problems; exploiting chunk-based parallelism (e.g., on multi-GPU/TPU) can further scale inference in large domains.

Future Research Directions

  • Adaptive CKANs with local refinement.
  • Integration with meshless or domain-decomposition methods for complex domains.
  • Bayesian extensions for uncertainty quantification in sparse/noisy data regimes.
  • Hardware acceleration leveraging CKAN’s chunk structure for large-scale, real-time operator learning (Wu et al., 9 Oct 2025).

The PO-CKAN framework extends and complements advances in physics-informed neural operator learning. One related approach, AC-PKAN, incorporates attention and Chebyshev polynomial bases to address limited expressivity and rank collapse, using wavelet-activated MLPs with internal and external attention (Residue-Gradient Attention). This design preserves a full-rank Jacobian and guarantees universal PDE approximation, albeit at higher computational cost than plain MLPs (Zhang et al., 13 May 2025).

A plausible implication is that PO-CKAN’s chunkwise rational activation design, when combined with advanced weighting schemes or alternative polynomial bases (e.g., orthogonal Chebyshev/Jacobi systems), could further enhance accuracy, stability, and scalability. This suggests a promising deployment template for operator learning in data-sparse regimes and complex, real-world engineering workflows (Zhang et al., 13 May 2025).
