Legendre-KAN: Polynomial Neural Framework
- Legendre-KAN is a neural architecture that combines Legendre polynomial bases with Kolmogorov–Arnold networks for spectral approximation and interpretability.
- It efficiently approximates fully nonlinear PDEs like the Monge–Ampère equation using adaptive sampling and robust input normalization.
- Its design delivers faster convergence and improved accuracy in high-dimensional scientific computing and optimal transport applications.
Legendre-KAN refers to a class of neural network architectures and numerical frameworks that leverage the synergy between Kolmogorov–Arnold Networks (KAN) and Legendre polynomial basis expansions. These methods are designed to exploit the theoretical and computational properties of Legendre polynomials—such as orthogonality, spectral approximation capabilities, and analytic structure—within the flexible and interpretable KAN paradigm. Legendre-KAN has been notably applied to the efficient solution of fully nonlinear Monge–Ampère equations with Dirichlet boundary conditions, as well as to supervised learning tasks and scientific computing problems where orthogonal expansions and robust normalizations are advantageous (Hu et al., 7 Apr 2025, Strawa et al., 16 Jul 2025). The defining features of Legendre-KAN are the representation of activations or functional components using weighted sums of Legendre polynomials and the incorporation of input normalization strategies specifically suited to the polynomial basis.
1. Theoretical Foundations and Network Architecture
The Legendre-Kolmogorov–Arnold Network method employs the structure of KAN, which draws upon the Kolmogorov–Arnold representation theorem. This theorem guarantees that any continuous multivariate function can be decomposed into superpositions of continuous univariate functions. In practice, KAN implements each layer as a composition of learned univariate activation functions, which in Legendre-KAN are parameterized as expansions in Legendre polynomials. This contrasts with standard multi-layer perceptrons (MLPs), for which the universal approximation theorem ensures expressivity but not necessarily interpretability or optimal basis adaptation.
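In its classical form, the theorem states that every continuous function $f : [0,1]^n \to \mathbb{R}$ admits a representation

$$ f(x_1, \dots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right), $$

with continuous univariate functions $\Phi_q$ and $\phi_{q,p}$; KAN layers replace these fixed inner and outer functions with learned, parameterized ones.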
In a typical Legendre-KAN, after mapping inputs to the interval $[-1,1]$ (using, for instance, an affine transformation of the domain), each univariate activation is represented as

$$ \phi(x) = \sum_{k=0}^{K} w_k \, P_k(x), $$

where $P_k$ is the Legendre polynomial of degree $k$ and $w_k$ is a trainable weight. The full network output is then constructed through nested compositions (layering), following

$$ u_\theta(x) = \big(\Phi^{(L)} \circ \Phi^{(L-1)} \circ \cdots \circ \Phi^{(1)}\big)(x), $$

where each layer $\Phi^{(\ell)}$ applies such Legendre expansions along every input–output edge and sums the results.
This layered, basis-adaptive construction improves both interpretability and convergence rate, supported by the orthogonality of Legendre polynomials on the target interval.
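As a concrete illustration, the following PyTorch sketch implements one such layer under stated assumptions (tanh squashing into $(-1,1)$, the three-term Legendre recurrence, and one trainable weight per input–output edge and degree). It is a minimal reconstruction, not the reference code of the cited papers.

```python
import torch
import torch.nn as nn

class LegendreKANLayer(nn.Module):
    """One KAN layer whose edge activations are Legendre expansions.

    Minimal sketch: inputs are squashed into (-1, 1) with tanh, Legendre
    features P_0..P_K are built by the three-term recurrence, and each
    output is a learned linear combination of all (input, degree) features.
    Assumes degree >= 1.
    """

    def __init__(self, in_dim: int, out_dim: int, degree: int = 5):
        super().__init__()
        self.degree = degree
        # One trainable weight per (output, input, polynomial degree).
        self.weights = nn.Parameter(
            torch.randn(out_dim, in_dim, degree + 1) / (in_dim * (degree + 1)) ** 0.5
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = torch.tanh(x)  # map activations into (-1, 1), the Legendre domain
        # Three-term recurrence: (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}
        polys = [torch.ones_like(x), x]
        for k in range(1, self.degree):
            polys.append(((2 * k + 1) * x * polys[k] - k * polys[k - 1]) / (k + 1))
        basis = torch.stack(polys, dim=-1)            # (batch, in_dim, K+1)
        return torch.einsum("bik,oik->bo", basis, self.weights)

# Stack layers to obtain the nested composition described above.
model = nn.Sequential(LegendreKANLayer(2, 16), LegendreKANLayer(16, 1))
u = model(torch.rand(128, 2))                          # (128, 1)
```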
2. Mathematical Formulation for PDE Solution
Legendre-KAN has been applied to the numerical solution of the Monge–Ampère equation, a prototypical fully nonlinear PDE of the form

$$ \det\!\big(D^2 u(x)\big) = f(x) \quad \text{in } \Omega, \qquad u = g \quad \text{on } \partial\Omega, $$

with $u$ required to be convex. The architecture approximates the unknown convex function $u$ by a Legendre polynomial expansion in each activation, after an appropriate affine mapping of the domain into $[-1,1]^d$:

$$ u(x) \approx u_\theta(x) = \big(\Phi^{(L)} \circ \cdots \circ \Phi^{(1)}\big)\big(\hat{x}(x)\big), \qquad \hat{x}(x) \in [-1,1]^d. $$
The overall network structure, with its layer-wise Legendre expansions, enables efficient global approximation and supports the convexity constraints required by the Monge–Ampère context.
Optimization proceeds by defining a loss functional that penalizes deviation from the Monge–Ampère PDE and its boundary conditions, typically a collocation (PINN-style) objective of the form

$$ \mathcal{L}(\theta) = \frac{1}{N_\Omega} \sum_{i=1}^{N_\Omega} \Big( \det D^2 u_\theta(x_i) - f(x_i) \Big)^2 + \frac{\lambda}{N_{\partial\Omega}} \sum_{j=1}^{N_{\partial\Omega}} \Big( u_\theta(x_j) - g(x_j) \Big)^2, $$

evaluated at adaptively sampled interior points $x_i \in \Omega$ and boundary points $x_j \in \partial\Omega$, where $\lambda > 0$ weights the boundary penalty.
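A minimal sketch of such a collocation loss, assuming a scalar-output network and plain autograd double differentiation for the Hessian (the papers' exact implementation and loss weighting may differ):

```python
import torch

def hessian_det(u_fn, x):
    """Determinant of the Hessian of a scalar network u_fn at points x.

    x: (N, d) tensor. Simple autograd double differentiation; adequate
    for small d (here d = 2, 3, or 4).
    """
    x = x.detach().requires_grad_(True)
    u = u_fn(x).sum()
    (grad,) = torch.autograd.grad(u, x, create_graph=True)      # (N, d)
    rows = [
        torch.autograd.grad(grad[:, i].sum(), x, create_graph=True)[0]
        for i in range(x.shape[1])
    ]
    hess = torch.stack(rows, dim=1)                              # (N, d, d)
    return torch.det(hess)

def ma_loss(u_fn, x_int, f, x_bnd, g, lam=1.0):
    """Collocation loss for det(D^2 u) = f in the domain, u = g on the boundary."""
    pde_res = hessian_det(u_fn, x_int) - f(x_int)
    bnd_res = u_fn(x_bnd).squeeze(-1) - g(x_bnd)
    return (pde_res ** 2).mean() + lam * (bnd_res ** 2).mean()
```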
3. Numerical Results and Performance Characteristics
Legendre-KAN has been evaluated on benchmark problems for both smooth and singular solution regimes:
- For smooth solutions, Legendre-KAN yields lower maximum and average errors, as well as faster convergence, when compared to standard MLPs with similar or greater parameter counts.
- In singular or piecewise-smooth regimes, the Legendre basis functions concentrate approximation power near singularities, especially when coupled with adaptive sampling focused on high-error regions (a residual-driven sampling sketch follows this list).
- High-dimensional cases (e.g., 3D and 4D Monge–Ampère equations) have been solved using Legendre-KAN with large, structured collocation grids, demonstrating scalability and systematic error reduction throughout the training process.
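The residual-driven refinement mentioned above can be sketched as follows; `adaptive_resample` is a hypothetical helper (reusing `hessian_det` from the loss sketch earlier), and the pool-and-top-k heuristic is an assumption rather than the authors' exact strategy.

```python
import torch

def adaptive_resample(u_fn, f, domain_lo, domain_hi, n_candidates=20000, n_keep=2000):
    """Residual-driven collocation refinement (illustrative heuristic).

    Draw a large pool of uniform candidates, score each by the squared
    Monge-Ampere residual, and keep the worst-scoring points so training
    concentrates where the current error is largest.
    """
    d = domain_lo.numel()
    cand = domain_lo + (domain_hi - domain_lo) * torch.rand(n_candidates, d)
    res = (hessian_det(u_fn, cand) - f(cand)).detach().pow(2)
    idx = torch.topk(res, n_keep).indices
    return cand[idx]

# Usage: refresh interior collocation points every few hundred steps, e.g.
# x_int = adaptive_resample(model, f, torch.zeros(2), torch.ones(2))
```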
Empirical comparisons consistently report that Legendre-KAN attains superior accuracy and faster training convergence compared to MLPs, particularly when orthogonality and input normalization are appropriately handled (Hu et al., 7 Apr 2025).
4. Input Normalization: CDF and Quantile Methods
Proper normalization of inputs is crucial for aligning the data distribution with the assumptions underlying the Legendre polynomial basis. While conventional MinMax scaling linearly maps input data to $[-1,1]$, this can be sensitive to outliers and may not produce a uniformly distributed variable—the setting for which Legendre polynomials are orthogonal.
Recent research demonstrates that normalization by the cumulative distribution function (CDF), notably the standard Gaussian CDF

$$ \Phi(z) = \tfrac{1}{2}\left(1 + \operatorname{erf}\!\big(z/\sqrt{2}\big)\right), $$

produces nearly uniform quantiles on $[0,1]$ (which can then be affinely shifted to $[-1,1]$). This transformation is monotonic, smoothly compresses outliers (through the error function), and empirically enhances the effectiveness of low-degree Legendre expansions:
- Improved test accuracy (by up to 2 percentage points for low-degree polynomial expansions) (Strawa et al., 16 Jul 2025)
- Faster convergence (reaching comparable accuracy in fewer training epochs)
- Uniformity of feature distribution throughout layers, as evidenced by activation histograms
The standard workflow with CDF normalization includes a preliminary LayerNorm (standardization), followed by the CDF mapping, after which Legendre features are computed and passed to the subsequent network layers.
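A minimal sketch of this workflow (the module name `CDFNorm` is hypothetical):

```python
import torch
import torch.nn as nn

class CDFNorm(nn.Module):
    """LayerNorm followed by Gaussian-CDF squashing into (-1, 1).

    Sketch of the workflow described above: standardize features, map
    them through Phi(z) = 0.5 * (1 + erf(z / sqrt(2))) to near-uniform
    quantiles on (0, 1), then shift to (-1, 1) for the Legendre basis.
    """

    def __init__(self, dim: int):
        super().__init__()
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.norm(x)                                   # standardize
        u = 0.5 * (1.0 + torch.erf(z / 2.0 ** 0.5))        # Gaussian CDF -> (0, 1)
        return 2.0 * u - 1.0                               # shift to (-1, 1)

# Usage: place before a Legendre feature layer.
pre = CDFNorm(dim=16)
x = pre(torch.randn(32, 16))                               # values in (-1, 1)
```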
5. Applications in Optimal Transport and Image Mapping
A concrete application of Legendre-KAN is to the solution of Monge–Ampère-type optimal transport problems in geometric image transformation scenarios. Here, the mapping $T = \nabla u$ is realized as the gradient of the learned convex potential $u$, which transports pixel locations in a source image to correct geometric distortions (e.g., flattening a fisheye projection). The general equation solved is

$$ \det\!\big(D^2 u(x)\big) = \frac{\rho(x)}{\sigma\!\big(\nabla u(x)\big)}, \qquad x \in \Omega, $$

for a given source density $\rho$ over the image domain $\Omega$ and a target density $\sigma$ (uniform in the simplest flattening case, so the right-hand side reduces to a rescaled $\rho$). The Legendre-KAN method enables robust and accurate warping of RGB-channel images, demonstrating stability under arbitrary boundary conditions and applicability to high-dimensional visual data (Hu et al., 7 Apr 2025).
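One way to realize the pixel warping, sketched under the assumption that resampling uses bilinear interpolation via `grid_sample` (an inverse-warp convention; the helper `warp_image` is hypothetical, not the authors' pipeline):

```python
import torch
import torch.nn.functional as F

def warp_image(u_fn, image):
    """Warp an RGB image by the gradient map T = grad u (illustrative sketch).

    image: (3, H, W) tensor. Pixel coordinates are normalized to [-1, 1],
    pushed through the transport map, and used to resample the source
    image with bilinear interpolation.
    """
    _, H, W = image.shape
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij"
    )
    pts = torch.stack([xs, ys], dim=-1).reshape(-1, 2).requires_grad_(True)
    u = u_fn(pts).sum()
    (T,) = torch.autograd.grad(u, pts)                 # transport map: grad u
    grid = T.reshape(1, H, W, 2).clamp(-1, 1).detach()
    return F.grid_sample(image.unsqueeze(0), grid, align_corners=True)[0]
```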
6. Implementation Considerations and Future Directions
Key practical features of Legendre-KAN include the following (combined in a short end-to-end sketch after the list):
- Use of Legendre polynomial expansions, requiring input data to be mapped to $[-1,1]$ and, ideally, rendered approximately uniform via CDF normalization.
- Layer-wise compositional structure, with each layer comprising learned basis expansions, both facilitating and requiring care in gradient flow.
- Efficient performance for high-dimensional PDEs given proper sampling and network sizing; adaptive sampling strategies further benefit the handling of singularities.
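As a rough end-to-end illustration, reusing the hypothetical `CDFNorm` and `LegendreKANLayer` sketches from earlier sections:

```python
import torch.nn as nn

# Hypothetical composition of the pieces sketched above:
# CDF normalization feeding stacked Legendre-KAN layers.
model = nn.Sequential(
    CDFNorm(dim=2),               # standardize + Gaussian-CDF map into (-1, 1)
    LegendreKANLayer(2, 32),      # learned Legendre expansions per edge
    LegendreKANLayer(32, 1),      # scalar potential u_theta
)
```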
Potential extensions include:
- Theoretical analysis of convergence rates for Legendre-KAN architectures, both in smooth and singular settings.
- Integration with spectral or finite element methods for problems with complex geometries.
- Applications to other nonlinear PDEs and to domains where tensorized Legendre-Galerkin methods have shown promise.
- Further refinement of normalization and pre-processing pipelines to maximize the effectiveness of the Legendre polynomial basis for structured data or measurements.
A plausible implication is that the pairing of Legendre polynomial representations with principled normalization schemes—such as CDF-to-quantile mapping—may become a standard paradigm in neural operator learning, especially for scientific computing tasks involving PDEs.
7. Broader Impact and Related Research
The Legendre-KAN methodology exemplifies a broader movement toward integrating classical mathematical analysis (orthogonal polynomials, spectral methods) with modern machine learning architectures. By adapting network design to known basis properties and the distributional structure of data, Legendre-KAN narrows the traditional trade-off among interpretability, expressivity, and computational efficiency.
Applications developed thus far cover nonlinear PDE solution, optimal transport, and enhanced supervised learning, with ongoing work addressing further domains such as stochastic process expansions and structured covariance modeling. Access to open-source implementations and reproducibility of numerical experiments are priorities in recent publications (Hu et al., 7 Apr 2025, Strawa et al., 16 Jul 2025), supporting community adoption and further methodological advancements.