KANHedge: BSDE Solver with Kolmogorov-Arnold Networks

Updated 19 January 2026
  • KANHedge is a BSDE-based solver that leverages Kolmogorov-Arnold Networks with learnable B-spline activations to offer accurate and smooth delta estimations in high-dimensional option pricing and hedging.
  • It replaces conventional MLPs in deep BSDE frameworks, reducing pricing errors and hedging risk (CVaR) by up to 9% in empirical studies on European and American basket options.
  • By employing spline activations that ensure smooth gradients, KANHedge enhances risk control and mitigates numerical instabilities associated with fixed activation functions in traditional PDE approaches.

KANHedge is a backward stochastic differential equation (BSDE)-based solver for high-dimensional option pricing and hedging. It replaces conventional Multi-Layer Perceptrons (MLPs) in the deep BSDE framework with Kolmogorov-Arnold Networks (KANs), which employ learnable B-spline activation functions. This architecture provides enhanced function approximation capabilities for continuous derivatives, specifically targeting improvements in hedging accuracy and risk control, particularly in high-dimensional settings where standard PDE-based methods are intractable due to the curse of dimensionality (Handal et al., 16 Jan 2026).

1. BSDE Formulation and Hedging Problem

The KANHedge methodology is rooted in the risk-neutral valuation paradigm for option pricing, where the price process $Y_t$ and corresponding hedging strategy $Z_t$ for a derivative contract are characterized by the BSDE:

$$Y_t = g(X_T) + \int_t^T f(s, X_s, Y_s, Z_s)\, ds - \int_t^T Z_s^\top \, dW_s,$$

with $Y_t$ adapted to the market filtration and $Z_t \in \mathbb{R}^d$ representing the vector of deltas (the number of units of each underlying asset held at time $t$). Here, $X_t$ is the state process, $W_t$ denotes a $d$-dimensional Brownian motion, $g$ is the payoff function, and $f$ is the driver specifying market structure.

The classical approach involves discretizing the underlying forward SDE and time grid, approximating YtY_t and ZtZ_t at each time step, and minimizing the expected squared loss on the terminal condition.

2. Deep BSDE Solvers with MLPs: Capabilities and Limitations

Standard deep BSDE solvers discretize the interval $[0, T]$ into $N$ steps and simulate the market paths via the forward SDE:

$$X_{t_{n+1}} = X_{t_n} + \mu(X_{t_n}, t_n)\, \Delta t + \Gamma(X_{t_n}, t_n)\, \Delta W_n.$$

At each time $t_n$, $Z_{t_n}$ is modeled as $\mathrm{MLP}_{\theta_n}(t_n, X_{t_n})$, where the network typically has 3–5 hidden layers of width 100–500 and uses fixed activation functions such as ReLU, tanh, or SiLU. Training minimizes the Monte Carlo approximation of the quadratic terminal loss

$$\mathcal{L}(\Theta) = \mathbb{E}\big[(Y_T^\Theta - g(X_T))^2\big] \approx \frac{1}{M}\sum_{i=1}^M \big(Y_T^{\Theta,(i)} - g(X_T^{(i)})\big)^2,$$

where $M$ is the number of sampled paths.
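The discretized rollout and terminal loss can be sketched end to end. The toy NumPy version below is illustrative, not the paper's code: the per-step networks are replaced by simple linear maps, the driver is taken as $f \equiv 0$, and the forward dynamics are geometric Brownian motion.

```python
import numpy as np

rng = np.random.default_rng(0)
d, N, M, T = 5, 50, 2048, 1.0           # assets, time steps, paths, horizon
dt = T / N
mu, sigma = 0.0, 0.2                    # toy risk-neutral drift and volatility
K = 1.0                                 # illustrative strike

def g(x):
    """Terminal payoff: arithmetic basket call."""
    return np.maximum(x.mean(axis=1) - K, 0.0)

# Toy stand-ins for the per-step networks: one linear map per time step.
theta = [rng.normal(scale=0.1, size=(d, d)) for _ in range(N)]
y0 = 0.08                               # trainable initial price guess

X = np.ones((M, d))                     # X_0
Y = np.full(M, y0)                      # Y_0^Theta
for n in range(N):
    dW = rng.normal(scale=np.sqrt(dt), size=(M, d))
    Z = X @ theta[n]                    # Z_{t_n} ~ network(t_n, X_{t_n})
    Y = Y + (Z * dW).sum(axis=1)        # dY = f dt + Z^T dW, with driver f = 0
    X = X + mu * X * dt + sigma * X * dW  # Euler step of the forward SDE

loss = np.mean((Y - g(X)) ** 2)         # Monte Carlo terminal loss L(Theta)
print(f"terminal loss: {loss:.4f}")
```

In a real solver the linear maps become trainable networks and `loss` is minimized by stochastic gradient descent over all per-step parameters and `y0` jointly.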

Key limitations arise from the use of fixed activations: the resulting function approximations often exhibit irregular or non-smooth gradients, and direct MLP-based estimation of $Z_t$ can produce inaccurate or noisy delta trajectories, compromising hedging performance.

3. Kolmogorov–Arnold Networks (KANs) and B-Spline Activations

KANs are motivated by the Kolmogorov–Arnold representation theorem, which asserts that any continuous multivariate function can be expressed as a finite sum of univariate continuous functions:

$$f(x_1, \dots, x_d) = \sum_{q=0}^{2d} A_q \left( \sum_{p=1}^d \Phi_{q,p}(x_p) \right),$$

where $A_q$ and $\Phi_{q,p}$ are continuous univariate functions.
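A classic illustration of this kind of decomposition (not taken from the paper): for strictly positive inputs, a product of variables is a single univariate function applied to a sum of univariate functions,

```latex
x_1 x_2 \cdots x_d \;=\; \exp\!\Big(\sum_{p=1}^{d} \ln x_p\Big),
\qquad A(u) = e^{u}, \quad \Phi_p(x_p) = \ln x_p .
```

The multivariate interaction is absorbed entirely into compositions of one-dimensional maps, which is the structure KAN layers learn.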

KAN layers instantiate this by employing learnable B-spline activations for each edge:

$$\phi(u) = \sum_{k=0}^K c_k B_{k,p}(u),$$

where $B_{k,p}$ are degree-$p$ B-spline basis functions and $c_k$ are trainable coefficients, yielding outputs smooth up to order $p-1$. Each KAN layer maps $x \in \mathbb{R}^{n_{\mathrm{in}}}$ to $y \in \mathbb{R}^{n_{\mathrm{out}}}$ via

$$y_j = \sum_{i=1}^{n_{\mathrm{in}}} w_{j,i}\, \phi_{j,i}(x_i) + b_j,$$

for $j = 1, \dots, n_{\mathrm{out}}$. Stacking such layers produces highly expressive multivariate approximators with controlled and smooth derivatives, facilitating improved modeling of option deltas and higher-order Greeks.
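A minimal forward pass of one such layer can be sketched in NumPy. This is an illustrative sketch, not the paper's implementation: the B-spline basis is evaluated via the Cox–de Boor recursion on a uniform knot vector, and all parameter shapes (`coeffs`, `w`, `b`) are assumptions chosen to match the equations above.

```python
import numpy as np

def bspline_basis(u, t, k, p):
    """Cox-de Boor recursion: degree-p basis B_{k,p} evaluated at array u."""
    if p == 0:
        return np.where((t[k] <= u) & (u < t[k + 1]), 1.0, 0.0)
    out = np.zeros_like(u, dtype=float)
    d1 = t[k + p] - t[k]
    if d1 > 0:
        out += (u - t[k]) / d1 * bspline_basis(u, t, k, p - 1)
    d2 = t[k + p + 1] - t[k + 1]
    if d2 > 0:
        out += (t[k + p + 1] - u) / d2 * bspline_basis(u, t, k + 1, p - 1)
    return out

def kan_layer(x, coeffs, w, b, t, p):
    """y_j = sum_i w_{j,i} * phi_{j,i}(x_i) + b_j with spline phis.

    x: (M, n_in); coeffs: (n_out, n_in, K+1); w: (n_out, n_in); b: (n_out,).
    """
    Kp1 = coeffs.shape[-1]
    # Basis values B_{k,p}(x_i) for every sample and input: (M, n_in, K+1)
    B = np.stack([bspline_basis(x, t, k, p) for k in range(Kp1)], axis=-1)
    phi = np.einsum('mik,oik->moi', B, coeffs)   # phi_{j,i}(x_i): (M, n_out, n_in)
    return np.einsum('oi,moi->mo', w, phi) + b   # (M, n_out)

# Demo with the degrees/knot counts quoted later in the article (p=3, K+1=10).
rng = np.random.default_rng(1)
p, Kp1 = 3, 10
t = np.linspace(-2.0, 2.0, Kp1 + p + 1)          # uniform knot vector
n_in, n_out = 4, 3
x = rng.uniform(-0.5, 0.5, size=(8, n_in))
coeffs = rng.normal(size=(n_out, n_in, Kp1))
w = rng.normal(size=(n_out, n_in))
b = np.zeros(n_out)
y = kan_layer(x, coeffs, w, b, t, p)
print(y.shape)  # (8, 3)
```

Because each $\phi_{j,i}$ is a cubic spline, the layer output is twice continuously differentiable in $x$ wherever the inputs stay inside the interior knot span.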

4. KANHedge Model Architecture

In KANHedge, every MLP approximator of $Z_{t_n}$ in the standard deep BSDE solver is replaced by a corresponding KAN:

$$Z_{t_n} \approx \Psi_{\theta_n}(t_n, X_{t_n}) = A^{(n)}\left(\Phi^{(n)}\big(B^{(n)}[t_n, X_{t_n}] + b^{(n)}\big)\right),$$

where $B^{(n)}$ is an affine transformation, $\Phi^{(n)}$ applies univariate spline activations $\phi_{j,i}$, and $A^{(n)}$ is an affine map over activations.

The joint parameter collection $\theta$ includes the initial price $Y_0 = u_0$ and all KAN weights. Training minimizes a regularized loss:

$$L(\theta) = \mathbb{E}\Big[ |Y_0 - Y_0^\theta|^2 + \lambda \sum_{n=0}^{N-1} \big\| Z_{t_n} - Z_{t_n}^\theta \big\|^2 \Big],$$

where $Y_0^\theta$ and $Z_{t_n}^\theta$ arise from a forward simulation under parameter $\theta$.

5. Training Protocol

Training employs the Adam optimizer with $\beta_1 = 0.9$, $\beta_2 = 0.999$, and $\epsilon = 10^{-8}$. The learning rate is initialized in $[10^{-4}, 10^{-3}]$, decaying by a factor of 0.5 every 1,000 epochs. Model fitting utilizes batch sizes of $M = 2{,}048$ Monte Carlo paths over $N = 50$–$100$ time steps for each trajectory. Spline activations are set to degree $p = 3$ with $K + 1 = 10$ knots. Optimization typically proceeds for 5,000–10,000 epochs until convergence of the loss function.
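The step-decay schedule quoted above can be written out directly; the initial rate `3e-4` below is an illustrative pick from the stated range, not a value from the paper.

```python
def lr_at(epoch, lr0=3e-4, decay=0.5, every=1000):
    """Step schedule: multiply the learning rate by `decay` every `every` epochs."""
    return lr0 * decay ** (epoch // every)

# First decay happens at epoch 1000, the second at epoch 2000, and so on.
print(lr_at(0), lr_at(999), lr_at(1000), lr_at(2500))
```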

6. Empirical Results: Option Pricing and Hedging

Empirical studies examine European geometric basket calls at $d = 10, 50, 100$ and American arithmetic basket puts at $d = 8, 20$ (with $d = 8$ as the main baseline). The primary evaluation metrics are:

  • Price error:

$$\operatorname{PriceError} = 100\, \frac{|u_0^{\mathrm{model}} - u_0^{\mathrm{ref}}|}{u_0^{\mathrm{ref}}}$$

  • Hedging cost $\mathrm{CVaR}_{0.95}$ at the 95% quantile, normalized by $|u_0^{\mathrm{ref}}|$:

$$\mathrm{CVaR}_{0.95} = \frac{\mathbb{E}\big[\, C \mid C \ge \mathrm{VaR}_{0.95} \,\big]}{|u_0^{\mathrm{ref}}|}$$
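An empirical estimator of this tail metric is straightforward; the sketch below is illustrative (standard-normal toy costs, not the paper's data).

```python
import numpy as np

def cvar(costs, alpha=0.95, u0_ref=1.0):
    """Empirical CVaR_alpha: mean hedging cost at or beyond the alpha-quantile
    VaR, normalized by |u0_ref|."""
    var = np.quantile(costs, alpha)      # VaR at the alpha quantile
    tail = costs[costs >= var]           # losses in the upper tail
    return tail.mean() / abs(u0_ref)

rng = np.random.default_rng(0)
costs = rng.normal(0.0, 1.0, size=100_000)   # toy hedging-cost sample
print(round(cvar(costs), 2))                 # theory for N(0,1): ~2.06
```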

Table of representative results:

| Setting | Price Error (MLP) | Price Error (KANHedge) | CVaR (MLP) | CVaR (KANHedge) |
|---|---|---|---|---|
| European basket ($d=10$) | 0.31% | 0.055% | 1.438 | 1.409 (−2%) |
| American basket ($d=8$) | ≤0.6% | ≤0.37% | 1.66 | 1.52 (−8.6%) |

For European baskets ($d=10$), KANHedge achieves pricing errors ≈0.055% versus ≈0.31% for MLP and reduces CVaR by ≈2.01%. Under high volatility or out-of-the-money conditions, MLP and KANHedge exhibit CVaR values around 1.90 and 1.88, respectively (a 1–4% improvement). For American baskets ($d=8$), KANHedge achieves ≈8.6% lower CVaR. Across all strike, correlation, and volatility combinations, KANHedge consistently reduces hedging risk cost (CVaR) by 4–9% relative to MLPs.

7. Analysis, Advantages, and Limitations

The use of B-spline activations in KANs leads to smoother output with control of derivatives up to the second order, mitigating the occurrence of "gamma spikes" in the hedging profile. The Kolmogorov–Arnold decomposition directs the network to learn univariate transforms, which yields well-behaved partial derivatives and thus more accurate delta estimation. Directly modeling $Z_t$ using KAN further enhances the alignment of model-predicted deltas with analytical references, as demonstrated by near-perfect overlap with the Black-Scholes delta in the single-asset case, while MLP-based deltas deviate notably in the tails.
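The single-asset reference used in such sanity checks is the closed-form Black-Scholes delta $N(d_1)$, which can be computed directly (the parameter values below are illustrative, not the paper's):

```python
import math

def bs_call_delta(S, K, T, r, sigma):
    """Closed-form Black-Scholes delta N(d1) for a European call."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    return 0.5 * (1.0 + math.erf(d1 / math.sqrt(2)))  # standard normal CDF

# At-the-money call, one year to expiry, 20% vol, zero rate: delta ~ 0.54.
print(round(bs_call_delta(S=100, K=100, T=1.0, r=0.0, sigma=0.2), 4))
```

Plotting a learned $Z_t$ against this curve across spot levels is what reveals the tail deviations of MLP-based deltas noted above.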

KAN layers introduce additional parameters per spline basis, incurring a memory and computational overhead of approximately 20–50% per training epoch. Hyperparameters such as knot placement and spline degree $p$ require careful tuning for optimal results.

Potential extensions include joint delta–gamma hedging by leveraging second derivatives of KAN outputs, integration of additional risk factors (e.g., Cox-Ingersoll-Ross stochastic interest rates), and multi-output KAN architectures for portfolio hedging with multiple payoff structures (Handal et al., 16 Jan 2026).
