
Co-evolutionary Calibration Framework

Updated 5 December 2025
  • The co-evolutionary calibration framework is a method that couples a genetic algorithm with a neural inverse map to rapidly calibrate the Heston stochastic volatility model.
  • It achieves efficient parameter space exploration by injecting neural predictions into the GA, reducing RMSE by approximately 15% in synthetic tests.
  • Hybrid data strategies, including GA-history and Latin hypercube sampling, are used to balance rapid adaptation with improved out-of-sample robustness.

A co-evolutionary calibration framework couples a genetic algorithm (GA) operating on model parameters with a neural network-based inverse map, which is trained to regress from option price surfaces to Heston model parameters. By allowing these components to co-adapt, the framework injects proposals from the learned neural inverse into the GA population, thereby accelerating exploration and exploitation of parameter space in the context of stochastic-volatility model calibration. The interplay between optimizer-driven samples and amortized inverse learning is central, with data-generation strategies profoundly affecting generalization and robustness (Gutierrez, 3 Dec 2025).

1. Heston Model and Calibration Objective

The co-evolutionary framework targets calibration of the Heston stochastic volatility model. Under the risk-neutral measure $\mathbb{Q}$, the asset price $S_t$ and variance $v_t$ dynamics are governed by:

$$\begin{aligned} dS_t &= r\,S_t\,dt + \sqrt{v_t}\,S_t\,dW_{1,t},\\ dv_t &= \kappa(\lambda - v_t)\,dt + \sigma\sqrt{v_t}\,dW_{2,t}, \qquad d\langle W_1, W_2\rangle_t = \rho\,dt, \end{aligned}$$

where the parameter vector is $\theta_H = (\kappa, \lambda, \sigma, \rho, v_0)$, subject to constraints such as the Feller condition $2\kappa\lambda > \sigma^2$ and $v_0 > 0$. European option prices are computed semi-analytically via Fourier inversion:

$$C(\theta) = S_0\,\Pi_1 - K e^{-r\tau}\,\Pi_2,$$

where $\Pi_1$ and $\Pi_2$ involve integrals over characteristic functions parameterized by $(\kappa, \lambda, \sigma, \rho, v_0)$. The calibration loss is typically the mean squared error over observed market prices:

$$\mathcal{L}_{\rm price}(\theta_H) = \frac{1}{M} \sum_{m=1}^{M} \left[ C(\theta_H; S_0, r, \tau_m, K_m) - C^{\rm mkt}_m \right]^2.$$
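
A compact numerical sketch of this pricer and loss is given below in Python/NumPy, using the "little Heston trap" form of the characteristic function. The integration grid (`u_max`, `n`) and all function names are illustrative assumptions, not specifications from the paper.

```python
import numpy as np

def heston_cf(u, S0, r, tau, kappa, lam, sigma, rho, v0):
    # Characteristic function of ln(S_tau) under Q ("little Heston trap" form).
    iu = 1j * u
    d = np.sqrt((rho * sigma * iu - kappa) ** 2 + sigma ** 2 * (iu + u ** 2))
    g = (kappa - rho * sigma * iu - d) / (kappa - rho * sigma * iu + d)
    C = (kappa * lam / sigma ** 2) * (
        (kappa - rho * sigma * iu - d) * tau
        - 2.0 * np.log((1.0 - g * np.exp(-d * tau)) / (1.0 - g)))
    D = ((kappa - rho * sigma * iu - d) / sigma ** 2) \
        * (1.0 - np.exp(-d * tau)) / (1.0 - g * np.exp(-d * tau))
    return np.exp(C + D * v0 + iu * (np.log(S0) + r * tau))

def heston_call(theta, S0, r, tau, K, n=4000, u_max=200.0):
    # C(theta) = S0*Pi1 - K*exp(-r*tau)*Pi2 via trapezoidal Fourier inversion.
    kappa, lam, sigma, rho, v0 = theta
    u = np.linspace(1e-8, u_max, n)               # avoid the u = 0 singularity
    du = u[1] - u[0]
    trap = lambda f: du * (f.sum() - 0.5 * (f[0] + f[-1]))
    cf = lambda x: heston_cf(x, S0, r, tau, kappa, lam, sigma, rho, v0)
    fwd = S0 * np.exp(r * tau)                    # phi(-i) equals the forward
    f1 = np.real(np.exp(-1j * u * np.log(K)) * cf(u - 1j) / (1j * u * fwd))
    f2 = np.real(np.exp(-1j * u * np.log(K)) * cf(u) / (1j * u))
    Pi1 = 0.5 + trap(f1) / np.pi
    Pi2 = 0.5 + trap(f2) / np.pi
    return S0 * Pi1 - K * np.exp(-r * tau) * Pi2

def price_loss(theta, S0, r, taus, Ks, C_mkt):
    # Mean squared calibration loss over the M observed quotes.
    model = np.array([heston_call(theta, S0, r, t, k) for t, k in zip(taus, Ks)])
    return np.mean((model - np.asarray(C_mkt)) ** 2)
```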

2. Genetic Algorithm Component

Within this framework, the GA maintains a population of candidate parameter vectors $\theta \in \Theta_H$. Evolutionary steps proceed as follows:

  • Population Initialization: $N = 50$ individuals sampled uniformly in $\Theta_H$.
  • Selection and Elitism: Elitism fraction $\varepsilon_{\rm GA} = 0.2$; elites are retained each generation.
  • Crossover and Mutation: Arithmetic crossover ($p_{x,\rm GA} = 0.3$) and Gaussian mutation (probability $p_{m,\rm GA} = 0.2$, per-parameter flip rate $\mu_{\rm GA} = 0.1$, perturbation scale $\sigma_{\rm mut}$ proportional to parameter ranges) diversify the offspring.
  • GA Pseudocode:

        initialize P_GA^(0)
        for g = 0 … G−1 do
            evaluate fitnesses F_GA
            select elites E
            create offspring via selection, crossover, mutation
            inject neural proposals (see Section 4)
            P_GA^(g+1) ← elites + offspring + injected
        end for

Fitness is the negated calibration loss, $F_{\rm GA}(\theta) = -\mathcal{L}_{\rm price}(\theta)$.
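
As a concrete illustration, one GA generation (excluding the neural injection of Section 4) might look as follows in Python. The tournament-selection helper and the mutation scale `sigma_scale` (a fraction of each parameter range) are assumptions for the sketch, not choices documented in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def tournament(pop, fitness, k=3):
    # Pick the fittest of k random individuals (an assumed selection scheme).
    idx = rng.integers(len(pop), size=k)
    return pop[idx[np.argmax(fitness[idx])]]

def ga_generation(pop, fitness, bounds, elite_frac=0.2, p_x=0.3, p_m=0.2,
                  mu=0.1, sigma_scale=0.1):
    # One generation: elitism, arithmetic crossover, Gaussian mutation (Sec. 2).
    N = len(pop)
    order = np.argsort(fitness)[::-1]          # higher fitness = lower loss
    elites = pop[order[: int(elite_frac * N)]]
    lo, hi = bounds[:, 0], bounds[:, 1]
    children = []
    while len(children) < N - len(elites):
        p1, p2 = tournament(pop, fitness), tournament(pop, fitness)
        if rng.random() < p_x:                 # arithmetic crossover
            a = rng.random()
            child = a * p1 + (1 - a) * p2
        else:
            child = p1.copy()
        if rng.random() < p_m:                 # Gaussian mutation
            flip = rng.random(child.size) < mu # per-parameter flip
            child = child + flip * rng.normal(0.0, sigma_scale * (hi - lo))
        children.append(np.clip(child, lo, hi))
    return np.vstack([elites, np.array(children)])
```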

3. Neural Inverse Map Design

The neural inverse component consists of a population of neural networks $\mathrm{NN}_i(s; W_i, A_i)$ mapping a flattened option price surface $s$ to predicted parameters $\hat{\theta} \in \mathbb{R}^5$. Architectural choices include:

  • Input dimension equal to the surface grid size; output dimension $5$.
  • Hidden layers $L \in \{1, 2, 3\}$; widths $h_\ell \in \{16, 32, 64, 128, 256, 512\}$.
  • Activations: $\{\mathrm{ReLU}, \mathrm{Tanh}, \mathrm{LeakyReLU}, \mathrm{ELU}\}$.

Training minimizes

$$L_{\rm NN}(W, A; \mathcal{D}) = \frac{1}{|\mathcal{D}|} \sum_{(s,\theta) \in \mathcal{D}} \left\| \mathrm{NN}(s; W, A) - \theta \right\|_2^2,$$

with the Adam optimizer (initial learning rate 0.001, exponential decay 0.9), a batch size of 64, and a 70/30 train/validation split.
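
A minimal PyTorch sketch of this training recipe follows. The tensors `surfaces` and `thetas`, the epoch count, and `grid_dim` in an example architecture such as `nn.Sequential(nn.Linear(grid_dim, 64), nn.ReLU(), nn.Linear(64, 5))` are placeholders, not values from the paper.

```python
import torch
from torch import nn

def train_inverse(model, surfaces, thetas, epochs=50):
    # MSE regression from flattened surfaces to the 5 Heston parameters,
    # with Adam (lr 1e-3, exponential decay 0.9), batch size 64, 70/30 split.
    n = len(surfaces)
    idx = torch.randperm(n)
    split = int(0.7 * n)
    tr, va = idx[:split], idx[split:]
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    sched = torch.optim.lr_scheduler.ExponentialLR(opt, gamma=0.9)
    loss_fn = nn.MSELoss()
    loader = torch.utils.data.DataLoader(
        torch.utils.data.TensorDataset(surfaces[tr], thetas[tr]),
        batch_size=64, shuffle=True)
    for _ in range(epochs):
        for s, th in loader:
            opt.zero_grad()
            loss_fn(model(s), th).backward()
            opt.step()
        sched.step()
    with torch.no_grad():                      # report validation MSE
        return loss_fn(model(surfaces[va]), thetas[va]).item()
```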

Evolutionary operators are also employed:

  • Weight crossover ($W_{\rm child} = \tfrac{1}{2}(W_{p1} + W_{p2})$) and mutation (probability $\mu_w = 0.1$ per parameter, perturbation scale $\sigma_w = 0.02$).
  • Architecture mutation: layer addition/removal, width and activation changes; survival fraction $\varepsilon_{\rm NN} = 0.2$.
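
The weight-level operators can be sketched directly on PyTorch parameter tensors, assuming the two parents share an architecture; architecture mutation is omitted here since it requires re-instantiating the network.

```python
import torch

def crossover_weights(parent1, parent2, child):
    # W_child = (W_p1 + W_p2) / 2, layer by layer.
    with torch.no_grad():
        for wc, w1, w2 in zip(child.parameters(),
                              parent1.parameters(), parent2.parameters()):
            wc.copy_(0.5 * (w1 + w2))
    return child

def mutate_weights(model, mu_w=0.1, sigma_w=0.02):
    # Each scalar weight is perturbed with probability mu_w by N(0, sigma_w^2).
    with torch.no_grad():
        for w in model.parameters():
            mask = (torch.rand_like(w) < mu_w).float()
            w.add_(mask * sigma_w * torch.randn_like(w))
    return model
```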

4. Co-evolutionary Training and Data Exchange

At each generation, the GA and neural populations interact bidirectionally, resulting in a dynamic data-generation and injection regime:

  1. GA $\rightarrow$ NN: The top $\varepsilon_{\rm GA} N$ GA elites generate new neural training samples $(\text{surface}, \theta)$. These expand the NN’s dataset.
  2. NN Training/Evolution: Each NN retrains on the augmented set, with fitness metrics including dataset loss and direct calibration quality.
  3. NN $\rightarrow$ GA: The top $\varepsilon_{\rm inj} N$ networks are selected. Their predictions on the target surface, plus Gaussian noise, yield $\theta_{\rm inject}$, which replaces the worst portion of the GA population (see the sketch after this list):

$$\hat{\theta}_i = \mathrm{NN}_i(\mathrm{flat}(S_{\rm target})), \qquad \theta_{\rm inject} = \hat{\theta}_i + \zeta, \quad \zeta \sim \mathcal{N}(0, \sigma_{\rm inj}^2 I).$$

  4. Population Updates: Elitism and evolutionary operators are applied to both GA and NN populations for the next generation.
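
A hedged sketch of the NN $\rightarrow$ GA injection step (item 3) is given below; the `eps_inj` and `sigma_inj` defaults are illustrative, not values reported in the paper.

```python
import numpy as np
import torch

def inject_proposals(ga_pop, ga_fitness, nn_pop, nn_fitness, target_surface,
                     eps_inj=0.1, sigma_inj=0.05, bounds=None):
    # The top eps_inj*N networks predict theta on the target surface; Gaussian
    # noise is added and the proposals replace the worst GA individuals.
    N = len(ga_pop)
    k = max(1, int(eps_inj * N))
    best_nets = [nn_pop[i] for i in np.argsort(nn_fitness)[::-1][:k]]
    s = torch.as_tensor(target_surface.ravel(), dtype=torch.float32)
    with torch.no_grad():
        proposals = np.stack([net(s).numpy() for net in best_nets])
    proposals = proposals + np.random.normal(0.0, sigma_inj, proposals.shape)
    if bounds is not None:                       # keep proposals in Theta_H
        proposals = np.clip(proposals, bounds[:, 0], bounds[:, 1])
    worst = np.argsort(ga_fitness)[:k]           # lowest fitness = worst
    ga_pop[worst] = proposals
    return ga_pop
```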

5. Data Generation Strategies and Their Impact

Two contrasting dataset construction protocols shape generalization and overfitting characteristics:

  • GA-History Sampling: Collects only $(s_j, \theta_j)$ pairs from GA elites over generations, resulting in “target-specific sampling” highly concentrated near $\theta^\star$.
  • Latin Hypercube Sampling (LHS): Ensures space-filling coverage by partitioning each dimension into strata, sampling uniformly within them, and combining without replacement. This provides uniform, diverse parameter coverage.
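
LHS generation takes only a few lines with `scipy.stats.qmc`; the parameter bounds below are illustrative assumptions, with draws violating the Feller condition discarded afterward.

```python
import numpy as np
from scipy.stats import qmc

# Illustrative bounds for (kappa, lambda, sigma, rho, v0).
bounds_lo = np.array([0.5,  0.01, 0.05, -0.95, 0.01])
bounds_hi = np.array([10.0, 0.50, 1.00, -0.05, 0.50])

sampler = qmc.LatinHypercube(d=5, seed=0)
unit = sampler.random(n=5000)                 # space-filling unit-cube draws
theta_lhs = qmc.scale(unit, bounds_lo, bounds_hi)

# Keep only draws satisfying the Feller condition 2*kappa*lambda > sigma^2.
feller = 2.0 * theta_lhs[:, 0] * theta_lhs[:, 1] > theta_lhs[:, 2] ** 2
theta_lhs = theta_lhs[feller]
```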

The diversity and representativeness of datasets produced by these strategies determine overfitting and extrapolation properties:

Strategy      In-sample Loss   Train–Validation Gap      Out-of-sample Stability
GA-History    Rapidly low      Widens with generations   Poor (overfits target)
LHS           Higher           Smaller                   Good (generalizes)

6. Empirical Results in Synthetic and Real Calibration Tasks

Empirical findings demonstrate the quantitative effects of co-evolutionary dynamics and data strategy:

  • Synthetic Targets: Co-evolutionary injection reduces RMSE faster than plain GA, achieving approximately 15% lower error at ten generations ($G = 10$).
  • Time-to-threshold (TTT) vs. L-BFGS: Over 20 trials, the median TTT to match L-BFGS RMSE is $\approx 26.4$ GA generations, indicating comparable calibration speed.
  • Neural Architecture Drift: Mean NN depth grows from $1.95$ to $2.15$ layers, average node count from $192$ to $208$, and maximum node count from $256$ to $384$ between generations $20$ and $100$, showing a trend toward higher capacity.
  • Learning Curves and Overfitting: Training MSE decreases steadily, but validation error plateaus and the gap widens as generations increase, confirming overfitting.
  • Strategy Comparison: GA-history datasets achieve near-zero training loss, but validation error remains large, indicating overfitting. LHS datasets result in higher training loss, but validation loss tracks training loss more closely, supporting better generalization.
  • Real SPX Calibration: On 152 quotes, the table below summarizes calibration loss and relative parameter errors (in %) over generations:

    Gen   Loss      κ (%)   λ (%)   σ (%)   ρ (%)   v₀ (%)
    20    2.98e-4   400.6   42.6    17.8    27.9    25.7
    40    2.07e-4   285.4   38.5    17.9    27.1    26.6
    60    1.39e-4   153.7   34.7    21.7    25.3    16.8
    80    1.13e-4   115.9   31.5    22.3    25.0     6.9
    100   8.3e-5     58.2   27.5    22.5    24.7     6.2

GA-history–trained inverse models fit the target more tightly in-sample, but this reflects target-dependent fitting, not a robust global inverse.

7. Practical Guidelines and Limitations

Analysis indicates:

  • Specialization and Overfitting: Co-evolutionary specialization arises since GA elites repeatedly sample near $\theta^\star$, shrinking diversity and causing the inverse model to memorize rather than learn a functional global inverse. LHS preserves broad coverage, trading in-sample fit for out-of-sample robustness.
  • Hybrid Data Regimens: A hybrid technique that combines initial LHS with periodic GA-history refinement can balance rapid adaptation with preserved generalization. Maintaining a mixed buffer is recommended to avoid overfitting (see the sketch after this list).
  • Production Recommendations: Amortized inverse models should be trained on datasets spanning the full plausible parameter space; exclusive reliance on target-specific or optimizer-guided data will reduce robustness and out-of-sample stability.
  • Algorithmic Tuning: Adjusting $\varepsilon_{\rm inj}$ (the NN-proposal injection rate) and regularizing neural-network capacity can mitigate domination or capacity-drift effects.
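
A minimal sketch of the mixed buffer from the hybrid-regimen bullet above, assuming datasets are stored as lists of (surface, theta) pairs; the mixing fraction and buffer size are illustrative defaults.

```python
import numpy as np

def mixed_buffer(lhs_data, ga_history, frac_lhs=0.5, size=4096, rng=None):
    # Blend space-filling LHS pairs with recent GA-history pairs so the
    # inverse model keeps global coverage while adapting to the target.
    rng = rng or np.random.default_rng()
    n_lhs = int(frac_lhs * size)
    i = rng.choice(len(lhs_data), size=min(n_lhs, len(lhs_data)),
                   replace=False)
    j = rng.choice(len(ga_history), size=min(size - n_lhs, len(ga_history)),
                   replace=False)
    return [lhs_data[k] for k in i] + [ga_history[k] for k in j]
```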

In summary, the co-evolutionary calibration framework harnesses neural inverse seeding to accelerate GA-based Heston calibration, but its self-reinforcing data loop risks overfitting without explicit dataset diversification. Latin hypercube sampling remains an effective, easily implemented countermeasure to ensure model generality across unseen implied-volatility surfaces (Gutierrez, 3 Dec 2025).

References

  1. Gutierrez (3 Dec 2025).
