Generative Models for Itô Processes

Updated 2 May 2026

Generative models for Itô processes are methods that learn drift and diffusion functions directly from data to accurately simulate stochastic differential equations.
Key techniques include trajectory flow-matching, neural jump ODEs, GAN-based simulation, and score-based diffusion, each with specialized training schemes.
These approaches effectively address challenges like irregular sampling, missing data, and path dependence, improving probabilistic forecasting and simulation fidelity.

Generative models for Itô processes aim to construct data-driven procedures that generate sample paths of stochastic differential equations (SDEs) of the Itô form,

$dX_t = \mu_t(X_{\cdot\wedge t})\,dt + \sigma_t(X_{\cdot\wedge t})\,dW_t,$

by learning the (potentially path-dependent) drift $\mu_t$ and diffusion $\sigma_t$ from observational data. This problem is central to time series modeling, stochastic simulation, and probabilistic forecasting, where nonparametric and machine learning-based generative models are applied to both continuous SDEs and related jump processes. Recent advances include trajectory flow-matching, neural jump ODEs, adversarial approaches using GANs, and SDE-based diffusion models. These frameworks are distinguished by their approach to model parameterization, training objectives, convergence guarantees, and their adaptability to irregularly sampled and partially observed time series.

1. Formulations and Objectives in Generative Modeling of Itô Processes

Generative modeling for Itô SDEs begins with the task of specifying a family of stochastic processes whose finite-dimensional distributions match those of the observed data. The goal is to learn suitable drift and diffusion functions from samples, enabling simulation of new paths consistent with the underlying dynamics.

Key formulations include:

Trajectory generator SDEs: Construct explicit, parameterized SDEs that interpolate between data endpoints with analytically tractable marginals. For endpoints $x_0, x_1\in\mathbb R^d$ over the time interval $[0,1]$, define the time-varying mean and variance,

$m_t=(1-t)x_0 + t\,x_1,\quad \tau_t=\eta^2 t(1-t)+\rho^2,$

and the bridge marginal

$P_t(\cdot|x_0,x_1)=\mathcal N(m_t, \tau_t I_d).$

The generator SDE with drift $u_t^{x_0,x_1}(x)$ and constant diffusion coefficient $\eta$, started from $Y_0\sim P_0(\cdot|x_0,x_1)=\mathcal N(x_0,\rho^2 I_d)$, reproduces the marginals $P_t$ exactly:

$dY_t = u_t^{x_0,x_1}(Y_t)dt + \eta\,dW_t, \quad u_t^{x_0,x_1}(x) = x_1 - x_0 - (x-m_t)\frac{\eta^2 t}{\tau_t} \,.$

To address randomness in the conditional law, the drift is averaged over the posterior of the endpoints given the current state, leading to a target drift $u_t(x)=\mathbb E\big[u_t^{X_0,X_1}(x)\mid X_t=x\big]$, which is then learned by a neural network generator (Jahn et al., 29 May 2025).

Neural Jump ODEs (NJODEs): Parameterize the time series with an ODE in a latent state $H_t$ with resets (jumps) at observation points. The output $Y_t$ estimates the conditional expectation $\mathbb E[X_t\mid\mathcal A_t]$, where $\mathcal A_t$ denotes the information available from observations up to time $t$, and higher moments, allowing learning of drift and diffusion from irregular or incomplete data, with provable convergence to the true SDE coefficients in the limit of dense data and large networks (Crowell et al., 3 Oct 2025).
Score-based diffusion models: Model the process as a time-inhomogeneous diffusion and learn a score model $s_\theta(x,t)\approx\nabla_x\log q_t(x)$ (the gradient of the log density), using reverse-time SDEs for sampling. The Itô density estimator enables efficient log-likelihood estimation, facilitating the combination (“superposition”) of multiple pre-trained models at inference (Skreta et al., 2024).
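The bridge construction in the trajectory generator SDE above can be checked numerically. The following is a minimal sketch, assuming a scalar state, an Euler-Maruyama discretization, and an initial law $\mathcal N(x_0,\rho^2)$ read off from $m_0$ and $\tau_0$; the function name `simulate_bridge` and all parameter values are illustrative and not taken from the cited paper.

```python
import numpy as np

def simulate_bridge(x0, x1, eta=1.0, rho=0.5, n_steps=200, n_paths=20000, seed=0):
    """Euler-Maruyama for dY = u_t(Y) dt + eta dW on [0, 1] (scalar case).

    Returns snapshots of the simulated paths at t = 0.5 and t = 1.0.
    """
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    # Assumed initial law: N(x0, rho^2), i.e. the bridge marginal at t = 0.
    y = x0 + rho * rng.standard_normal(n_paths)
    snapshots = {}
    for k in range(n_steps):
        t = k * dt
        m_t = (1.0 - t) * x0 + t * x1                   # bridge mean
        tau_t = eta**2 * t * (1.0 - t) + rho**2         # bridge variance
        drift = (x1 - x0) - (y - m_t) * eta**2 * t / tau_t
        y = y + drift * dt + eta * np.sqrt(dt) * rng.standard_normal(n_paths)
        if k + 1 == n_steps // 2:
            snapshots[0.5] = y.copy()
    snapshots[1.0] = y.copy()
    return snapshots

snaps = simulate_bridge(x0=-1.0, x1=2.0)
for t, sample in snaps.items():
    m_t = (1.0 - t) * (-1.0) + t * 2.0
    tau_t = 1.0**2 * t * (1.0 - t) + 0.5**2
    print(f"t={t}: empirical mean/var = {sample.mean():.3f}/{sample.var():.3f}, "
          f"target = {m_t:.3f}/{tau_t:.3f}")
```

Up to Monte Carlo and discretization error, the empirical mean and variance should track $m_t$ and $\tau_t$. For small $\rho$ the drift becomes stiff near $t=1$, so a finer time grid would be needed there.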

These approaches typically enforce that the learned stochastic process has the appropriate finite-dimensional marginals or conditional laws, ensuring fidelity to the data distribution.

2. Algorithmic Approaches and Training Schemes

Approaches differ in their training objectives, algorithms for parameter learning, and path generation procedures:

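Trajectory Flow-Matching: The drift network $u_t^\theta$ is trained to minimize the $L^2$ distance to the oracle drift $u_t^{x_0,x_1}$ under the joint law of the time $t$, the endpoints $(x_0,x_1)$, and the bridge sample $x\sim P_t(\cdot|x_0,x_1)$:

$\mathcal L(\theta)=\mathbb E_{t,\,(x_0,x_1),\,x\sim P_t(\cdot|x_0,x_1)}\big\|u_t^\theta(x)-u_t^{x_0,x_1}(x)\big\|^2.$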

For jump processes, the generator’s kernel is matched via a closed-form Kullback-Leibler divergence, exploiting Gaussian parameterizations (Jahn et al., 29 May 2025).

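NJODE Regression Training: NJODEs use regression objectives for moment estimation. For a target process $X$ observed at times $t_1,\dots,t_n$, with $Y_{t_i-}$ denoting the model output just before the update at $t_i$, one minimizes a squared-error objective of the form

$\Psi(Y)=\mathbb E\Big[\frac1n\sum_{i=1}^n\big(|X_{t_i}-Y_{t_i}|_2+|Y_{t_i}-Y_{t_i-}|_2\big)^2\Big],$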

and for direct coefficient learning, a noise-adapted one-sided loss. Sampling proceeds via estimated drifts and diffusions in an Euler scheme, yielding sample paths that converge to the true law (Crowell et al., 3 Oct 2025).

GAN-based SDE Simulation: Standard GANs approximate the marginal distributions (“weak approximation”); conditional GANs (cGANs) learn one-step transitions $X_{t+\Delta t}\mid X_t$ by conditioning. Supervised GANs further incorporate the noise input $Z$ into the discriminator and employ an explicit $L^2$ penalty, enforcing path-wise (“strong”) approximation and near-bijectivity of the generator mapping (Rhijn et al., 2021).
Score-based Diffusion (SuperDiff): Reverse SDEs are integrated using estimated scores. The Itô density estimator computes the time evolution of log densities during inference without expensive inner divergence computations, enabling scalable composition of models (Skreta et al., 2024).
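The flow-matching objective described above can be estimated by Monte Carlo without simulating any SDE. The sketch below assumes the Gaussian bridge marginal and oracle drift from Section 1, uniformly sampled times, and paired endpoints stored as arrays; `conditional_drift` and `flow_matching_loss` are hypothetical helper names, and a plain callable stands in for the neural drift network.

```python
import numpy as np

def conditional_drift(x, t, x0, x1, eta, rho):
    """Oracle drift u_t^{x0,x1}(x) of the bridge SDE from Section 1."""
    m_t = (1.0 - t) * x0 + t * x1
    tau_t = eta**2 * t * (1.0 - t) + rho**2
    return (x1 - x0) - (x - m_t) * eta**2 * t / tau_t

def flow_matching_loss(drift_fn, x0, x1, eta=1.0, rho=0.5, seed=0):
    """Monte Carlo estimate of E || drift_fn(x, t) - u_t^{x0,x1}(x) ||^2.

    x0, x1: arrays of shape (n, d) holding paired endpoints.
    drift_fn: callable (x, t) -> array of shape (n, d); stands in for the network.
    """
    rng = np.random.default_rng(seed)
    n, d = x0.shape
    t = rng.uniform(0.0, 1.0, size=(n, 1))              # assumed: t ~ U[0, 1]
    m_t = (1.0 - t) * x0 + t * x1
    tau_t = eta**2 * t * (1.0 - t) + rho**2
    x = m_t + np.sqrt(tau_t) * rng.standard_normal((n, d))  # x ~ N(m_t, tau_t I)
    residual = drift_fn(x, t) - conditional_drift(x, t, x0, x1, eta, rho)
    return float(np.mean(np.sum(residual**2, axis=1)))

# Illustrative data: endpoints related by a unit shift.
rng = np.random.default_rng(1)
x0 = rng.standard_normal((256, 2))
x1 = x0 + 1.0
print("zero drift loss:", flow_matching_loss(lambda x, t: np.zeros_like(x), x0, x1))
print("constant drift loss:", flow_matching_loss(lambda x, t: np.ones_like(x), x0, x1))
```

Because the loss only needs samples of $(t, x_0, x_1, x)$ and a closed-form target, a gradient-based optimizer could minimize it directly over network parameters; no solver is called inside the training loop.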

Table 1 summarizes representative training objectives.

Framework	Objective Type	Main Training Loss
Trajectory Matching SDE	Drift regression ($L^2$)	$\mathbb E\|u_t^\theta(x)-u_t^{x_0,x_1}(x)\|^2$
NJODE	Predictive regression/MSE	Squared error between $X_{t_i}$ and the model output at observation times $t_i$
GAN (supervised)	Adversarial + supervised regularization	GAN loss + $L^2$ penalty w.r.t. inverse-CDF
Score-based Diffusion	Score regression	Score-matching or likelihood-based

3. Handling Irregular Sampling, Missing Data, and Path Dependence

Irregular time series, missing values, and path-dependent dynamics are ubiquitous in practice.

Memory-based Conditioning: Trajectory generator networks and NJODEs admit finite “memory” architectures, where the generator has access to the most recent $k$ data points or the entire observed path history. Under weak Lipschitz continuity conditions, this autoregressive conditioning ensures reconstructibility of the joint law, up to controlled error (Jahn et al., 29 May 2025, Crowell et al., 3 Oct 2025).
Coordinate-wise Missingness: NJODEs encode observation masks $M_{t_i}\in\{0,1\}^d$ in their regression objective via projections, allowing robust estimation even with incomplete data (Crowell et al., 3 Oct 2025).
Path Dependence: Both NJODEs and flow-matching SDEs can model path-dependent $\mu_t$, $\sigma_t$ since the neural parameterizations can use dynamic representations of the observation history (Jahn et al., 29 May 2025, Crowell et al., 3 Oct 2025).
No Solver–Backpropagation: Trajectory flow-matching, NJODEs, and score-based diffusions decouple generator parameter training from simulation; backpropagation through ODE/SDE solvers is avoided. This reduces computational cost and increases robustness to irregular or sparse sampling (Jahn et al., 29 May 2025, Crowell et al., 3 Oct 2025).
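The role of observation masks in a regression objective can be illustrated with a small example. This is a simplified sketch of mask-based projection, assuming a binary mask with 1 for observed and 0 for missing coordinates; it is not the NJODE objective itself, and `masked_squared_error` is a hypothetical helper.

```python
import numpy as np

def masked_squared_error(pred, obs, mask):
    """Mean over observation times of the squared error on observed coordinates.

    pred, obs: arrays of shape (n_obs, d); obs may contain NaN where missing.
    mask: array of shape (n_obs, d) with 1 = observed, 0 = missing.
    """
    mask = np.asarray(mask)
    # Select the residual only where a coordinate was observed, so that
    # missing entries (possibly NaN) never enter the loss.
    diff = np.where(mask > 0, pred - obs, 0.0)
    return float(np.mean(np.sum(diff**2, axis=1)))

pred = np.array([[1.0, 2.0], [3.0, 4.0]])
obs = np.array([[1.5, np.nan], [3.0, 5.0]])   # second coordinate missing at first time
mask = np.array([[1, 0], [1, 1]])
print(masked_squared_error(pred, obs, mask))  # (0.25 + 1.0) / 2 = 0.625
```

Only observed coordinates contribute to the gradient, so a model trained this way is never penalized for its prediction on a coordinate that was not measured at that time.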

4. Model Composition and Superposition

Advances in model interoperability enable the mixing and matching of pretrained generative models at inference.

Superposition of Diffusion Models: The SuperDiff framework enables composition of multiple diffusion models by constructing convex combinations of their density-weighted drift (score) fields. For models $i=1,\dots,M$ with marginal densities $q_t^i$, drift fields $u_t^i$, and mixture weights $w_i\ge 0$, the composite drift

$u_t(x)=\sum_{i=1}^M\frac{w_i\,q_t^i(x)}{\sum_{j=1}^M w_j\,q_t^j(x)}\,u_t^i(x)$

is used to sample from the mixture law. Logical-OR and logical-AND combinations are instantiated by choice or optimization of the weights. The log-density Itô estimator underpins the computational tractability of this approach, as all required statistics can be estimated without backpropagating through learned network divergences (Skreta et al., 2024).

Empirical Implications: This enables flexible, inference-only fusion of models for multi-modal generation, conditional composition, and enhanced diversity, demonstrated across image and protein structure data (Skreta et al., 2024).
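The density-weighted convex combination behind the mixture (logical-OR) composition can be illustrated in one dimension. The sketch below assumes closed-form Gaussian densities and toy linear drift fields so that everything is computable by hand; in the actual method the densities are not available in closed form and are tracked along the sampling path by the Itô density estimator. All function names are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def mixture_weights(x, densities, w):
    """Per-point weights w_i q_i(x) / sum_j w_j q_j(x); they sum to 1 over i."""
    q = np.stack([w_i * q_i(x) for w_i, q_i in zip(w, densities)], axis=0)
    # Note: far from every component the sum can underflow to 0; a practical
    # implementation would work with log-densities instead.
    return q / q.sum(axis=0, keepdims=True)

def composite_drift(x, densities, drifts, w):
    """Convex combination of the component drifts with density-based weights."""
    lam = mixture_weights(x, densities, w)
    u = np.stack([u_i(x) for u_i in drifts], axis=0)
    return (lam * u).sum(axis=0)

# Two toy "models": densities centred at -2 and +2, each with a drift
# pulling toward its own centre (illustrative choices only).
densities = [lambda x: gaussian_pdf(x, -2.0, 0.5), lambda x: gaussian_pdf(x, 2.0, 0.5)]
drifts = [lambda x: -(x + 2.0), lambda x: -(x - 2.0)]

x = np.array([-2.0, 0.0, 2.0])
print(mixture_weights(x, densities, [0.5, 0.5]))
print(composite_drift(x, densities, drifts, [0.5, 0.5]))
```

Near each centre the composite drift is dominated by the model that assigns high density there, while at the midpoint both models contribute equally and their opposing drifts cancel.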

5. Empirical Comparisons and Convergence Properties

Several empirical benchmarks compare model classes and demonstrate convergence behavior.

Trajectory Generator Matching: Flow-matching SDEs and jump process generators trained on irregular, real-world data produce sample paths with correct marginal distributions and accurate interpolation between data points, without inner loop simulations (Jahn et al., 29 May 2025).
NJODE Convergence: Under model and sampling conditions, NJODE-trained drift and diffusion coefficients converge in $L^2$ to the true conditionals, and generated paths converge in law to the data-generating Itô process. Experiments on geometric Brownian motion and Ornstein–Uhlenbeck processes confirm accurate moment and marginal recovery under varying data sparsity (Crowell et al., 3 Oct 2025).
GAN-based Simulation: Supervised GANs outperform Euler and Milstein schemes in strong error for one-dimensional SDEs on coarse time grids, while standard GANs may produce poor path-wise approximations despite matching marginal laws (Rhijn et al., 2021). For geometric Brownian motion and CIR, supervised GANs exhibit strong errors significantly below those of the classical numerical solvers on such coarse grids.
Score-based Diffusion and Superposition: Mixture and AND/OR composite models (SuperDiff) maintain or improve distributional and sample diversity relative to single-model baselines in vision and protein backbone generation (Skreta et al., 2024).

Table 2 presents representative strong errors reported for various methods (Rhijn et al., 2021).

Process	Euler	Milstein	Supervised GAN
GBM	0.12	0.08	0.05
CIR	0.07	0.05	0.03
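For context on how a strong error of this kind can be measured, the sketch below estimates the path-wise error of the Euler scheme for geometric Brownian motion by driving the scheme and the exact solution with the same Brownian increments. It assumes the metric $\mathbb E|X_T-\hat X_T|$ and illustrative parameters ($\mu=0.05$, $\sigma=0.2$, $T=1$); it is not intended to reproduce the values in Table 2.

```python
import numpy as np

def gbm_strong_error_euler(mu=0.05, sigma=0.2, x0=1.0, T=1.0,
                           n_steps=1, n_paths=50000, seed=0):
    """Estimate E|X_T - X_T^Euler| for dX = mu X dt + sigma X dW."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    dW = np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
    x = np.full(n_paths, x0)
    for k in range(n_steps):
        x = x + mu * x * dt + sigma * x * dW[:, k]       # Euler-Maruyama step
    # Exact GBM solution evaluated on the same Brownian path.
    w_T = dW.sum(axis=1)
    exact = x0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * w_T)
    return float(np.mean(np.abs(exact - x)))

for n in (1, 4, 16):
    print(f"n_steps={n:2d}: strong error ~ {gbm_strong_error_euler(n_steps=n):.5f}")
```

The error shrinks as the grid is refined, which is why a coarse grid (few steps) is the regime where a learned one-step map has room to improve on classical schemes.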

6. Limitations, Assumptions, and Future Directions

Assumptions and limitations differ across approaches:

Process Regularity: Generator matching and NJODE methods require right-continuity with left limits and finite second moments for processes; further, Lipschitz conditions on conditional transition laws ensure error control (Jahn et al., 29 May 2025, Crowell et al., 3 Oct 2025).
Adversarial Training: GAN-based methods may suffer from lack of equilibria and instability in adversarial optimization. They require either exact simulation pairs or sampled conditional distributions for supervision (Rhijn et al., 2021).
Data Requirements: NJODE frameworks assume access to many independent trajectories and known observation times; adversarial and diffusion-based models can work with single long time series but may need extensive sampling (Crowell et al., 3 Oct 2025).
Model Specialization: Score-based diffusion frameworks target diffusion-like SDEs; extending superposition to non-Ornstein–Uhlenbeck and general Itô processes is a topic of ongoing research (Skreta et al., 2024).
Irregular Sampling: All state-of-the-art models incorporate mechanisms (autoregressive memory, input masks, or path-dependent conditioning) for dealing with irregular and missing observations.

A plausible implication is that further research will focus on bridging these methods, developing parameter-efficient transfer and fine-tuning, and establishing unified generative frameworks for SDEs with both diffusion and jump components under practical data constraints.

7. Comparative Summary

Generative models for Itô processes can be classified as follows:

Approach	Parametric Family	Training Principle	Data Adaptivity	Convergence Type
Trajectory Matching SDE	Analytic bridges, NN	Flow-matching ($L^2$)	Irregular/autoregressive	Marginal law recovery
NJODE	Latent ODE+NN, jumps	Regression (MSE)	Highly flexible	$L^2$ coefficient + law
Supervised GAN	NN generator	Adversarial + $L^2$	Conditional, 1D	Strong/weak (empirical)
Score-based Diffusion	Neural score models	Likelihood/score	Inference composition	Score marginal & mixture

Each paradigm provides concrete mechanisms for generative modeling of Itô processes, with varying emphasis on sample-path accuracy, model interpretability, and computational efficiency. Foundational contributions by Jahn et al. (Jahn et al., 29 May 2025), Crowell–Krach–Teichmann (Crowell et al., 3 Oct 2025), and others establish a rigorous basis for the ongoing development and application of these methods to complex, irregular, and high-dimensional time series.


References (4)

Trajectory Generator Matching for Time Series (2025)

Neural Jump ODEs as Generative Models (2025)

The Superposition of Diffusion Models Using the Itô Density Estimator (2024)

Monte Carlo Simulation of SDEs using GANs (2021)
