
Physics-Informed Chebyshev Neural Operator

Updated 9 February 2026
  • CPNO is a mesh-free operator learning framework that leverages Chebyshev polynomial bases for enhanced stability and spectral accuracy in solving parameterized PDEs.
  • It incorporates a parameter-dependent modulation mechanism within a deep neural network to effectively integrate heterogeneous, multi-scale, and high-frequency PDE data.
  • Empirical results on benchmarks demonstrate state-of-the-art performance with rapid convergence and reduced errors in complex scenarios such as transonic airfoil flows.

The Physics-Informed Chebyshev Polynomial Neural Operator (CPNO) is a mesh-free deep learning framework devised to solve parameterized partial differential equations (PDEs). It replaces monomial or Fourier-type feature expansions with Chebyshev polynomial bases, thereby ensuring enhanced numerical stability, spectral convergence, and robustness in physics-informed learning settings. CPNO introduces a parameter-dependent modulation mechanism to seamlessly integrate heterogeneous PDE data, making it particularly effective for multi-scale, high-frequency, and parametric PDE operator learning, including applications in complex geometries such as transonic airfoil flow (Chen et al., 2 Feb 2026).

1. Chebyshev Polynomial Expansions and Functional Representation

CPNO employs Chebyshev polynomials of the first kind, $T_n(x)$, for stable and efficient representation of functions defined on $[-1,1]$. Formally,

$$T_0(x)=1,\qquad T_1(x)=x,\qquad T_{n+1}(x)=2\,x\,T_n(x)-T_{n-1}(x),\qquad x\in[-1,1].$$

These polynomials are orthogonal under the weight $(1-x^2)^{-1/2}$ and satisfy the uniform bound $|T_n(x)|\le 1$, making them well suited to constructing spectral approximations. For a sufficiently smooth function $u(x)$,

$$u(x)\approx \sum_{k=0}^{N} c_k\,T_k(x),$$

with coefficients $c_k$ given by a weighted inner product. For multi-dimensional domains (including time and parameters), tensor-product bases are used, ensuring that the feature representation covers each spatial, temporal, and parametric input (Chen et al., 2 Feb 2026, Mostajeran et al., 6 Jan 2025).
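The expansion above can be made concrete with a short NumPy sketch: the three-term recurrence builds the feature matrix $[T_0(x),\dots,T_N(x)]$, and the coefficients are approximated with Chebyshev–Gauss quadrature (the function names `chebyshev_features` and `chebyshev_coeffs` are illustrative, not from the paper).

```python
import numpy as np

def chebyshev_features(x, N):
    """Evaluate T_0..T_N at points x in [-1, 1] via the three-term recurrence."""
    x = np.asarray(x, dtype=float)
    T = np.empty((N + 1,) + x.shape)
    T[0] = 1.0
    if N >= 1:
        T[1] = x
    for n in range(1, N):
        T[n + 1] = 2.0 * x * T[n] - T[n - 1]  # T_{n+1} = 2x T_n - T_{n-1}
    return T

def chebyshev_coeffs(f, N, M=256):
    """Coefficients c_k from the weighted inner product, approximated with
    Chebyshev-Gauss quadrature at nodes x_j = cos(pi (j + 1/2) / M)."""
    j = np.arange(M)
    x = np.cos(np.pi * (j + 0.5) / M)
    T = chebyshev_features(x, N)
    c = (2.0 / M) * (T @ f(x))
    c[0] /= 2.0   # the k = 0 term carries half weight in this normalization
    return c

c = chebyshev_coeffs(np.exp, N=10)
x = np.linspace(-1.0, 1.0, 101)
u = chebyshev_features(x, 10).T @ c       # truncated expansion sum_k c_k T_k(x)
err = np.max(np.abs(u - np.exp(x)))      # uniform error, tiny for analytic f
print(err)
```

For an analytic function such as $e^x$ the coefficients decay geometrically, so even $N=10$ gives near machine-precision uniform error, which is the spectral-convergence behavior the section describes.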

2. CPNO Architectural Design

CPNO’s core is a $q$-layer synthesis network that recursively builds polynomial-valued feature maps. The initial feature vector $\mathbf h^{(0)}(x,t)$ is a concatenation of Chebyshev features evaluated coordinate-wise (inputs normalized to $[-1,1]$). Each subsequent layer performs

$$\mathbf h^{(j)}(x,t;\theta) = \sigma\Big[(A_j\,\Phi(x,t))\,\odot\,\mathbf h^{(j-1)}(x,t)\,\odot\,\boldsymbol\omega_j(\theta) + \boldsymbol\varphi_j(\theta)\Big],$$

where $A_j$ is a learnable linear spectral projection, $\odot$ is the Hadamard (elementwise) product (which effectively raises the polynomial degree), and $\boldsymbol\omega_j(\theta), \boldsymbol\varphi_j(\theta)$ are parametric modulation vectors produced by mapping networks that ingest the PDE parameters $\theta$. The nonlinearity $\sigma(\cdot)$ is GELU or tanh to ensure smoothness and spectral fidelity. After $q$ layers, a final affine map outputs the PDE solution or its expansion coefficients.

Pseudocode summarizing the network forward pass:

x_, t_ = normalize(x, t)        # map inputs to [-1, 1]
Φ = ChebyshevFeatures(x_, t_)
h = Φ
for j in range(q):
    p = A_j @ Φ                 # learnable spectral projection
    s = p * h                   # Hadamard product raises polynomial degree
    ω, φ = MappingNet_j(θ)      # parameter-dependent modulation
    h = GELU(ω * s + φ)
u_pred = W_q @ h + b_q
return u_pred
Each mapping network $M_j$ is a small MLP (2 layers, width 32 in experiments), ingesting finitely encoded PDE parameters (e.g., Chebyshev coefficients of coefficient functions, or scalars such as viscosity) (Chen et al., 2 Feb 2026).
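The pseudocode above can be turned into a small runnable NumPy sketch. All weights below (`A`, `W_map`, `W_out`) are random stand-ins for the learnable pieces, the "mapping net" is collapsed to a single bias-free linear map of a scalar $\theta$, and the toy sizes are not the paper's; the point is only to show the shapes and the modulated Hadamard update.

```python
import numpy as np

rng = np.random.default_rng(0)
P, q, d_h = 4, 3, 10            # Chebyshev order, depth, hidden width (toy sizes)

def cheb(x, P):
    """Stack T_0..T_P along the last axis via the recurrence."""
    T = [np.ones_like(x), x]
    for n in range(1, P):
        T.append(2 * x * T[n] - T[n - 1])
    return np.stack(T[:P + 1], axis=-1)       # shape (..., P + 1)

def gelu(z):
    return 0.5 * z * (1 + np.tanh(np.sqrt(2 / np.pi) * (z + 0.044715 * z**3)))

# random stand-ins for the learnable parameters (illustrative only)
A = rng.normal(size=(q, d_h, 2 * (P + 1))) * 0.1     # spectral projections A_j
W_map = rng.normal(size=(q, 2 * d_h, 1)) * 0.1       # toy linear "mapping nets"
W_out = rng.normal(size=(1, d_h)) * 0.1              # final affine head

def cpno_forward(x, t, theta):
    Phi = np.concatenate([cheb(x, P), cheb(t, P)], axis=-1)  # coordinate-wise features
    h = np.tile(Phi, int(np.ceil(d_h / Phi.shape[-1])))[..., :d_h]
    for j in range(q):
        p = Phi @ A[j].T                          # learnable spectral projection
        m = (W_map[j] @ np.array([theta])).reshape(2, d_h)
        omega, phi = m[0], m[1]                   # parameter-dependent modulation
        h = gelu(omega * (p * h) + phi)           # Hadamard products raise degree
    return h @ W_out.T                            # final affine output map

x = np.linspace(-1.0, 1.0, 5)
t = np.zeros(5)
u = cpno_forward(x, t, theta=0.01)
print(u.shape)  # (5, 1)
```

Because every layer multiplies degree-$P$ features into the running state, the effective polynomial degree of the output grows with depth $q$, which is how the architecture reaches high-frequency content without a fine mesh.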

3. Physics-Informed Loss Formulation

The network is trained by minimizing a composite loss:

$$\mathcal L(\phi) = \lambda_{\rm pde}\,\mathcal L_{\rm pde} + \lambda_{\rm ic}\,\mathcal L_{\rm ic} + \lambda_{\rm bc}\,\mathcal L_{\rm bc} + \lambda_{\rm data}\,\mathcal L_{\rm data}.$$

  • PDE residual: $\mathcal L_{\rm pde} = \mathbb E_{x,t,\theta}\,\|\mathcal L(\theta)\,u_\phi(x,t;\theta) - f(x,t;\theta)\|_2^2$
  • Initial condition: $\mathcal L_{\rm ic} = \mathbb E_{x,\theta}\,\|\mathcal I(\theta)\,u_\phi(0,x;\theta) - u_0(x;\theta)\|_2^2$
  • Boundary condition: $\mathcal L_{\rm bc} = \mathbb E_{x\in\partial\Omega,\,t,\theta}\,\|\mathcal B(\theta)\,u_\phi(x,t;\theta) - g(x,t;\theta)\|_2^2$
  • Data fit (optional): $\mathcal L_{\rm data} = \mathbb E_{(x,t,\theta,u^{\rm ref})}\,\|u_\phi(x,t;\theta) - u^{\rm ref}\|_2^2$

The hyperparameters $\lambda_{\rm pde}, \lambda_{\rm ic}, \lambda_{\rm bc}, \lambda_{\rm data}$ balance the loss components; typical values set the PDE, initial, and boundary weights to $1.0$ and the data weight (when used) to $10^{-3}$ (Chen et al., 2 Feb 2026).
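The weighting arithmetic is simple but easy to get wrong in code, so here is a minimal sketch of assembling the composite loss. The residual arrays are random placeholders standing in for $\mathcal L(\theta)u_\phi - f$ and friends evaluated at sampled points; the $\lambda$ values are the defaults reported above.

```python
import numpy as np

def mse(r):
    """Mean squared residual over a batch of collocation points."""
    return float(np.mean(r**2))

# placeholder residual arrays, as if evaluated at sampled collocation points
rng = np.random.default_rng(1)
r_pde  = rng.normal(scale=0.10, size=2000)  # L(theta) u - f at interior points
r_ic   = rng.normal(scale=0.05, size=200)   # u(0, x) - u_0 at t = 0
r_bc   = rng.normal(scale=0.05, size=200)   # B(theta) u - g on the boundary
r_data = rng.normal(scale=0.20, size=3)     # few-shot supervised snapshots

# weights per the reported defaults: 1.0 for physics terms, 1e-3 for data
lam = dict(pde=1.0, ic=1.0, bc=1.0, data=1e-3)
loss = (lam["pde"] * mse(r_pde) + lam["ic"] * mse(r_ic)
        + lam["bc"] * mse(r_bc) + lam["data"] * mse(r_data))
print(loss)
```

Note how the small $\lambda_{\rm data}$ keeps the handful of supervised snapshots from dominating the physics terms, which are averaged over thousands of collocation points.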

4. Theoretical Properties and Numerical Conditioning

CPNO’s use of the Chebyshev basis addresses two critical operator learning challenges:

  1. Spectral bias mitigation: The Chebyshev expansion exhibits near-minimax uniform approximation error. For $u(x;\theta)$ analytic on a Bernstein ellipse, the best $N$-term Chebyshev truncation satisfies

$$\Big\|u(\cdot;\theta) - \sum_{k=0}^{N} c_k^*(\theta)\,T_k\Big\|_{L^\infty([-1,1])} \le \frac{C\,\rho^{-N}}{\rho - 1}$$

for some $\rho > 1$ and constant $C$.

  2. Stability and conditioning: The Lebesgue constant for Chebyshev nodes grows only logarithmically,

$$\Lambda_N = \frac{2}{\pi}\ln(N+1) + O(1),$$

and the condition number of the Chebyshev Gram matrix is $O(N^2)$, dramatically lower than the $O(e^{cN})$ growth for monomials. This ensures stable gradient propagation and robust training even at high polynomial orders (Chen et al., 2 Feb 2026, Mostajeran et al., 6 Jan 2025).
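The conditioning gap between monomial and Chebyshev bases is easy to observe numerically: compare the condition numbers of the two basis-evaluation matrices at the same Chebyshev nodes (a quick sanity check, not an experiment from the paper).

```python
import numpy as np
from numpy.polynomial import chebyshev as C

N = 20
# Chebyshev (first-kind) nodes on [-1, 1]
x = np.cos(np.pi * (np.arange(N + 1) + 0.5) / (N + 1))

V_mono = np.vander(x, N + 1, increasing=True)  # columns 1, x, ..., x^N
V_cheb = C.chebvander(x, N)                    # columns T_0, ..., T_N

cond_mono = np.linalg.cond(V_mono)
cond_cheb = np.linalg.cond(V_cheb)
print(cond_mono)   # grows exponentially with N
print(cond_cheb)   # stays O(1) at these nodes
```

At Chebyshev nodes the columns of `V_cheb` are discretely orthogonal, so its condition number stays near $\sqrt 2$ regardless of $N$, while the monomial Vandermonde matrix is already ill-conditioned at $N=20$. This is the numerical reason gradient propagation through Chebyshev features remains stable at high polynomial orders.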

5. Benchmark Performance and Empirical Results

CPNO achieves state-of-the-art accuracy and convergence rates on a wide range of parameterized PDE benchmarks:

  • Burgers’ equation: $L^2$ error $3.06\times10^{-2}$ (zero-shot, no data), outperforming PI-DeepONet ($8.09\times10^{-2}$) and HyperPINNs ($4.55\times10^{-2}$).
  • Allen–Cahn equation: $1.69\times10^{-2}$ (zero-shot).
  • Diffusion–reaction: $2.96\times10^{-2}$ (zero-shot).
  • 2D vorticity Navier–Stokes: $2.72\times10^{-2}$ (zero-shot).
  • Few-shot regime: three additional solution snapshots reduce errors further (e.g., $1.09\times10^{-2}$ for Allen–Cahn).

The framework converges rapidly: errors below $10^{-2}$ are typically attained in fewer than 5,000 epochs, whereas baseline neural operator models require over 20,000 epochs. In frequency analysis, CPNO captures high-wavenumber content ($k > 20$) that MLP-based alternatives fail to resolve.

In the transonic airfoil flow experiment (parameterized complex geometry), CPNO with Chebyshev order 10, network depth 12, and degree-16 parameter encoding achieves $L^1$ errors on $(u, v, p)$ of roughly $2\times10^{-3}$ (mean) and $8\times10^{-2}$ (max), demonstrating its capacity for operator learning on challenging fluid dynamics problems (Chen et al., 2 Feb 2026).

6. Implementation Best Practices

Recommended configuration for CPNO includes:

  • Chebyshev polynomial order $P = 4$–$6$ for smooth fields; $P \le 10$ for sharp layers or shocks.
  • Depth $q = 6$–$10$ layers; hidden dimension $d_h = (d+1)(P+1)$ (typically 64–128).
  • Activation: GELU or tanh; ReLU is discouraged because its nondifferentiability degrades spectral accuracy.
  • Adam optimizer with the learning rate warmed up to $10^{-3}$, followed by multiplicative decay.
  • Minibatches of 2,000–4,000 uniformly sampled collocation points per step.
  • Normalization of each physical domain to $[-1,1]$ so that polynomial features remain well scaled.
  • Mapping networks for parameter-dependent modulation kept shallow and narrow (2–3 layers, width 16–32) to control overfitting (Chen et al., 2 Feb 2026).
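The domain-normalization step in the list above is just an affine map onto the Chebyshev reference interval; a minimal sketch (helper names are illustrative):

```python
import numpy as np

def to_reference(z, z_min, z_max):
    """Affine map of a physical coordinate onto the reference interval [-1, 1]."""
    return 2.0 * (z - z_min) / (z_max - z_min) - 1.0

def from_reference(z_ref, z_min, z_max):
    """Inverse map back to the physical domain [z_min, z_max]."""
    return 0.5 * (z_ref + 1.0) * (z_max - z_min) + z_min

x = np.linspace(0.0, 2.0, 5)          # physical spatial domain [0, 2]
x_ref = to_reference(x, 0.0, 2.0)
print(x_ref)                          # [-1. -0.5  0.  0.5  1. ]
```

One caveat worth remembering: derivatives taken in reference coordinates pick up the chain-rule factor $2/(z_{\max}-z_{\min})$, which must be applied before evaluating PDE residuals in physical units.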

7. Extensions, Limitations, and Context

CPNO’s foundation explicitly incorporates the Chebyshev basis, parameter modulation, and mesh-free operator learning, distinguishing it from monomial-MLPs, Fourier-based, or vanilla Galerkin and KAN variants (Guo et al., 2024, Zhang et al., 13 May 2025, Mostajeran et al., 6 Jan 2025). While CPNO is robust to multi-scale and high-frequency phenomena, practical limitations include the expense of high-dimensional tensor-product expansions and domain normalization strategies for non-rectilinear geometries. Potential future work involves sparse-grid Chebyshev representations, adaptive network depth, and hybridization with attention-based enhancements as in recent Chebyshev–KAN models (Zhang et al., 13 May 2025).

CPNO situates itself within the rapidly evolving landscape of neural operator design, providing a theoretically principled and empirically validated approach to solving parametric, time-dependent, and nonlinear PDEs with superior accuracy, efficiency, and training robustness (Chen et al., 2 Feb 2026).
