Generative Models for Itô Processes
- Generative models for Itô processes are methods that learn drift and diffusion functions directly from data to accurately simulate stochastic differential equations.
- Key techniques include trajectory flow-matching, neural jump ODEs, GAN-based simulation, and score-based diffusion, each with specialized training schemes.
- These approaches effectively address challenges like irregular sampling, missing data, and path dependence, improving probabilistic forecasting and simulation fidelity.
Generative models for Itô processes aim to construct data-driven procedures that generate sample paths of stochastic differential equations (SDEs) of the Itô form $dX_t = \mu_t\,dt + \sigma_t\,dW_t$ (drift $\mu$, diffusion $\sigma$, Brownian motion $W$),
by learning the (potentially path-dependent) drift and diffusion from observational data. This problem is central to time series modeling, stochastic simulation, and probabilistic forecasting, where nonparametric and machine learning-based generative models are applied to both continuous SDEs and related jump processes. Recent advances include trajectory flow-matching, neural jump ODEs, adversarial approaches using GANs, and SDE-based diffusion models. These frameworks are distinguished by their approach to model parameterization, training objectives, convergence guarantees, and their adaptability to irregularly sampled and partially observed time series.
1. Formulations and Objectives in Generative Modeling of Itô Processes
Generative modeling for Itô SDEs begins with the task of specifying a family of stochastic processes whose finite-dimensional distributions match those of the observed data. The goal is to learn suitable drift and diffusion functions from samples, enabling simulation of new paths consistent with the underlying dynamics.
Key formulations include:
- Trajectory generator SDEs: Construct explicit, parameterized SDEs that interpolate between data endpoints with analytically tractable marginals. For a pair of endpoints over a time interval, one defines time-varying means and variances and the corresponding bridge marginal; a generator SDE with a matching drift and constant diffusion reproduces these marginals exactly. To account for randomness in the conditional law, the drift is averaged over the posterior of the endpoint, yielding a target drift that is then learned by a neural network generator (Jahn et al., 29 May 2025).
- Neural Jump ODEs (NJODEs): Parameterize the time series with an ODE in a latent state that is reset (jumps) at observation points. The model output estimates the conditional expectation of the process given past observations, together with higher conditional moments, allowing drift and diffusion to be learned from irregular or incomplete data, with provable convergence to the true SDE coefficients in the limit of dense data and large networks (Crowell et al., 3 Oct 2025).
- Score-based diffusion models: Model the process as a time-inhomogeneous diffusion and learn a score model, i.e. an approximation of the gradient of the log-density $\nabla_x \log p_t(x)$, then sample with the reverse-time SDE. The Itô density estimator enables efficient log-likelihood estimation, facilitating the combination (“superposition”) of multiple pre-trained models at inference (Skreta et al., 2024).
These approaches typically enforce that the learned stochastic process has the appropriate finite-dimensional marginals or conditional laws, ensuring fidelity to the data distribution.
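To make the idea of an endpoint-interpolating SDE with tractable marginals concrete, the sketch below uses the classical Brownian bridge. This is an illustrative assumption, not necessarily the bridge used by Jahn et al.; the function names `bridge_moments` and `simulate_bridge` and all parameter values are hypothetical. It simulates the bridge SDE with constant diffusion by an Euler scheme and compares the empirical midpoint mean and variance with the closed-form bridge moments.

```python
import numpy as np

def bridge_moments(t, t0, t1, x0, x1, sigma):
    """Closed-form mean and variance of a Brownian bridge at time t."""
    s = (t - t0) / (t1 - t0)
    mean = x0 + s * (x1 - x0)
    var = sigma**2 * (t - t0) * (t1 - t) / (t1 - t0)
    return mean, var

def simulate_bridge(t0, t1, x0, x1, sigma, n_steps, n_paths, rng):
    """Euler scheme for dX = (x1 - X) / (t1 - t) dt + sigma dW."""
    dt = (t1 - t0) / n_steps
    x = np.full(n_paths, float(x0))
    path = [x.copy()]
    # Stop one step before t1, where the bridge drift becomes singular.
    for k in range(n_steps - 1):
        t = t0 + k * dt
        drift = (x1 - x) / (t1 - t)
        x = x + drift * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
        path.append(x.copy())
    return np.array(path)  # shape (n_steps, n_paths); row k is time t0 + k * dt

rng = np.random.default_rng(0)
paths = simulate_bridge(0.0, 1.0, x0=0.0, x1=2.0, sigma=0.5,
                        n_steps=200, n_paths=20000, rng=rng)

# Row 100 corresponds to the midpoint t = 0.5.
mid_mean, mid_var = bridge_moments(0.5, 0.0, 1.0, 0.0, 2.0, 0.5)
emp_mean, emp_var = paths[100].mean(), paths[100].var()
print(f"mean: closed form {mid_mean:.4f}, simulated {emp_mean:.4f}")
print(f"var:  closed form {mid_var:.4f}, simulated {emp_var:.4f}")
```

In a learned generator, the endpoint `x1` is not known at sampling time, which is why the text averages the drift over the posterior of the endpoint and fits a network to that average.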
2. Algorithmic Approaches and Training Schemes
Approaches differ in their training objectives, algorithms for parameter learning, and path generation procedures:
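- Trajectory Flow-Matching: The drift network $\mu_\theta$ is trained to minimize the $L^2$ distance to the oracle drift $\mu^\star$ under the joint bridge law:
$$\mathcal{L}(\theta) = \mathbb{E}\left[\lVert \mu_\theta(t, X_t) - \mu^\star(t, X_t) \rVert^2\right],$$
where the expectation is taken over time and over samples from the joint bridge law.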
For jump processes, the generator’s kernel is matched via a closed-form Kullback-Leibler divergence, exploiting Gaussian parameterizations (Jahn et al., 29 May 2025).
- NJODE Regression Training: NJODEs use regression objectives for moment estimation. For a given target process, one minimizes a mean-squared prediction error evaluated at the observation times; for direct coefficient learning, a noise-adapted one-sided loss is used instead. Sampling then runs an Euler scheme with the estimated drift and diffusion, yielding sample paths that converge in law to the true process (Crowell et al., 3 Oct 2025).
- GAN-based SDE Simulation: Standard GANs approximate the marginal distributions (“weak approximation”); conditional GANs (cGANs) learn one-step transitions by conditioning on the previous state. Supervised GANs further feed the noise input to the discriminator and add an explicit supervised penalty, enforcing path-wise (“strong”) approximation and near-bijectivity of the generator mapping (Rhijn et al., 2021).
- Score-based Diffusion (SuperDiff): Reverse SDEs are integrated using estimated scores. The Itô density estimator computes the time evolution of log densities during inference without expensive inner divergence computations, enabling scalable composition of models (Skreta et al., 2024).
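The regression-then-simulate pattern shared by several of these schemes can be sketched in a few lines. The example below is a deliberately simplified stand-in: a linear least-squares fit replaces the neural network, the data come from a hypothetical Ornstein–Uhlenbeck process, and the estimator is a plain increment regression rather than the flow-matching or NJODE objectives of the cited papers. It shows how a squared-error fit to increments recovers drift and diffusion, which are then plugged into an Euler scheme to generate new paths.

```python
import numpy as np

def euler_maruyama(drift, diffusion, x0, dt, n_steps, rng):
    """Simulate dX = drift(t, X) dt + diffusion(t, X) dW for a batch of paths."""
    x = np.array(x0, dtype=float)
    path = [x.copy()]
    for k in range(n_steps):
        t = k * dt
        dw = np.sqrt(dt) * rng.standard_normal(x.shape)
        x = x + drift(t, x) * dt + diffusion(t, x) * dw
        path.append(x.copy())
    return np.array(path)  # shape (n_steps + 1, n_paths)

# Hypothetical ground truth: dX = -theta * X dt + sigma dW.
theta, sigma, dt = 1.0, 0.5, 0.01
rng = np.random.default_rng(1)

def true_drift(t, x):
    return -theta * x

def true_diffusion(t, x):
    return sigma * np.ones_like(x)

data = euler_maruyama(true_drift, true_diffusion,
                      x0=rng.standard_normal(5000), dt=dt, n_steps=100, rng=rng)

# Drift: least-squares regression of increments / dt on the current state.
x_now = data[:-1].ravel()
dx = (data[1:] - data[:-1]).ravel()
slope, intercept = np.polyfit(x_now, dx / dt, deg=1)

# Diffusion: residual second moment of the increments, scaled by dt.
resid = dx - (slope * x_now + intercept) * dt
sigma_hat = np.sqrt(np.mean(resid**2) / dt)

def learned_drift(t, x):
    return slope * x + intercept

def learned_diffusion(t, x):
    return sigma_hat * np.ones_like(x)

# Generate fresh paths from the fitted coefficients.
new_paths = euler_maruyama(learned_drift, learned_diffusion,
                           x0=np.zeros(2000), dt=dt, n_steps=100, rng=rng)
print(f"estimated slope {slope:.3f} (true {-theta}), sigma {sigma_hat:.3f} (true {sigma})")
```

Note that training here never differentiates through `euler_maruyama`; the solver is used only to produce data and to sample after fitting.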
Table 1 summarizes representative training objectives.
| Framework | Objective Type | Main Training Loss |
|---|---|---|
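| Trajectory Matching SDE | Drift regression ($L^2$) | Expected squared error between network drift and oracle drift |
| NJODE | Predictive regression/MSE | Mean-squared prediction error for conditional moment estimates |
| GAN (supervised) | Adversarial + supervised regularization | GAN loss plus supervised penalty w.r.t. inverse-CDF |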
| Score-based Diffusion | Score regression | Score-matching or likelihood-based |
3. Handling Irregular Sampling, Missing Data, and Path Dependence
Irregular time series, missing values, and path-dependent dynamics are ubiquitous in practice.
- Memory-based Conditioning: Trajectory generator networks and NJODEs admit finite “memory” architectures, where the generator has access to a fixed number of the most recent data points or to the entire observed path history. Under weak Lipschitz continuity conditions, this autoregressive conditioning ensures reconstructibility of the joint law, up to controlled error (Jahn et al., 29 May 2025, Crowell et al., 3 Oct 2025).
- Coordinate-wise Missingness: NJODEs encode observation masks in their regression objective via projections, allowing robust estimation even with incomplete data (Crowell et al., 3 Oct 2025).
- Path Dependence: Both NJODEs and flow-matching SDEs can model path-dependent drift and diffusion coefficients, since the neural parameterizations can use dynamic representations of the observation history (Jahn et al., 29 May 2025, Crowell et al., 3 Oct 2025).
- No Solver–Backpropagation: Trajectory flow-matching, NJODEs, and score-based diffusions decouple generator parameter training from simulation; backpropagation through ODE/SDE solvers is avoided. This reduces computational cost and increases robustness to irregular or sparse sampling (Jahn et al., 29 May 2025, Crowell et al., 3 Oct 2025).
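The role of observation masks in a regression objective can be illustrated with a masked mean-squared error. This is a minimal sketch under simplifying assumptions, not the NJODE loss itself: the function name `masked_mse` is hypothetical, and the mask simply zeroes out the contribution of unobserved coordinates so that placeholder values never enter the loss.

```python
import numpy as np

def masked_mse(pred, target, mask):
    """Mean squared error over observed coordinates only (mask == 1)."""
    mask = np.asarray(mask, dtype=float)
    # np.where drops unobserved entries, so NaN placeholders in `target`
    # cannot contaminate the sum.
    sq_err = np.where(mask > 0, (pred - target) ** 2, 0.0)
    n_obs = mask.sum()
    return sq_err.sum() / n_obs if n_obs > 0 else 0.0

# Two observation times, two coordinates; one coordinate is missing.
pred = np.array([[1.0, 2.0], [3.0, 4.0]])
target = np.array([[1.5, np.nan], [3.0, 2.0]])
mask = np.array([[1, 0], [1, 1]])

loss = masked_mse(pred, target, mask)
print(loss)
```

Normalizing by the number of observed entries rather than by the array size keeps the loss scale comparable across samples with different amounts of missing data.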
4. Model Composition and Superposition
Advances in model interoperability enable the mixing and matching of pretrained generative models at inference.
- Superposition of Diffusion Models: The SuperDiff framework enables composition of multiple diffusion models by constructing convex combinations of their density-weighted drift (score) fields. For models with marginal densities $q_t^i$, drift fields $u_t^i$, and mixture weights $w_i$, the composite drift
$$u_t(x) = \sum_i \frac{w_i\, q_t^i(x)}{\sum_j w_j\, q_t^j(x)}\, u_t^i(x)$$
is used to sample from the mixture law. Logical-OR and logical-AND combinations are instantiated by choice or optimization of the weights. The log-density Itô estimator underpins the computational tractability of this approach, as all required statistics can be estimated without backpropagating through learned network divergences (Skreta et al., 2024).
- Empirical Implications: This enables flexible, inference-only fusion of models for multi-modal generation, conditional composition, and enhanced diversity, demonstrated across image and protein structure data (Skreta et al., 2024).
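The density-weighted combination can be checked numerically in a toy setting. The sketch below assumes two one-dimensional Gaussian "models" whose densities and scores are known in closed form, which sidesteps the Itô density estimator that SuperDiff uses to track log-densities for real networks; all names and parameter values are hypothetical. It verifies that the density-weighted average of the component scores equals the score of the mixture.

```python
import numpy as np

def gaussian_pdf(x, mu, s):
    return np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

def gaussian_score(x, mu, s):
    """Gradient of the log-density of N(mu, s^2) with respect to x."""
    return -(x - mu) / s**2

def mixture_score(x, mus, sigmas, weights):
    """Density-weighted convex combination of the component scores."""
    dens = np.array([w * gaussian_pdf(x, m, s)
                     for m, s, w in zip(mus, sigmas, weights)])
    scores = np.array([gaussian_score(x, m, s) for m, s in zip(mus, sigmas)])
    resp = dens / dens.sum(axis=0)  # nonnegative and sums to 1 at each x
    return (resp * scores).sum(axis=0)

def log_mixture(x, mus, sigmas, weights):
    return np.log(sum(w * gaussian_pdf(x, m, s)
                      for m, s, w in zip(mus, sigmas, weights)))

x = np.linspace(-3.0, 3.0, 13)
mus, sigmas, weights = [-1.0, 1.5], [0.7, 1.0], [0.4, 0.6]

# Reference: central finite difference of the mixture log-density.
h = 1e-5
fd_score = (log_mixture(x + h, mus, sigmas, weights)
            - log_mixture(x - h, mus, sigmas, weights)) / (2.0 * h)
max_err = np.max(np.abs(mixture_score(x, mus, sigmas, weights) - fd_score))
print(f"max deviation from finite-difference score: {max_err:.2e}")
```

The weights `resp` depend on the evaluation point, so the composite field is not a fixed average of the models; each model dominates where its own density is largest.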
5. Empirical Comparisons and Convergence Properties
Several empirical benchmarks compare model classes and demonstrate convergence behavior.
- Trajectory Generator Matching: Flow-matching SDEs and jump process generators trained on irregular, real-world data produce sample paths with correct marginal distributions and accurate interpolation between data points, without inner loop simulations (Jahn et al., 29 May 2025).
- NJODE Convergence: Under suitable conditions on the model and the sampling scheme, NJODE-trained drift and diffusion coefficients converge to the true conditionals, and generated paths converge in law to the data-generating Itô process. Experiments on geometric Brownian motion and Ornstein–Uhlenbeck processes confirm accurate moment and marginal recovery under varying data sparsity (Crowell et al., 3 Oct 2025).
- GAN-based Simulation: Supervised GANs outperform Euler and Milstein schemes in strong error for one-dimensional SDEs on coarse time grids, while standard GANs may produce poor path-wise approximations despite matching marginal laws. For geometric Brownian motion (GBM) and the Cox–Ingersoll–Ross (CIR) process, supervised GANs exhibit strong errors significantly below those of the classical numerical solvers (Rhijn et al., 2021).
- Score-based Diffusion and Superposition: Mixture and AND/OR composite models (SuperDiff) maintain or improve distributional and sample diversity relative to single-model baselines in vision and protein backbone generation (Skreta et al., 2024).
Table 2 presents representative strong errors reported for various methods (Rhijn et al., 2021).
| Process | Euler | Milstein | Supervised GAN |
|---|---|---|---|
| GBM | 0.12 | 0.08 | 0.05 |
| CIR | 0.07 | 0.05 | 0.03 |
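For readers unfamiliar with how a strong error is measured, the following sketch estimates it for the Euler and Milstein schemes on geometric Brownian motion, where the exact solution is available in closed form. The parameter values are hypothetical and are not those behind Table 2, and no GAN is involved; the point is only that a strong error compares numerical and exact paths driven by the same Brownian increments.

```python
import numpy as np

def gbm_strong_errors(mu, sigma, s0, T, n_steps, n_paths, rng):
    """Mean absolute terminal error of Euler and Milstein against exact GBM."""
    dt = T / n_steps
    dw = np.sqrt(dt) * rng.standard_normal((n_steps, n_paths))
    euler = np.full(n_paths, float(s0))
    milstein = np.full(n_paths, float(s0))
    for k in range(n_steps):
        euler = euler + mu * euler * dt + sigma * euler * dw[k]
        # Milstein adds the second-order correction 0.5 * b * b' * (dW^2 - dt).
        milstein = (milstein + mu * milstein * dt + sigma * milstein * dw[k]
                    + 0.5 * sigma**2 * milstein * (dw[k] ** 2 - dt))
    # Exact solution driven by the same Brownian path (W_T is the sum of increments).
    exact = s0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * dw.sum(axis=0))
    return np.mean(np.abs(euler - exact)), np.mean(np.abs(milstein - exact))

rng = np.random.default_rng(2)
err_euler, err_milstein = gbm_strong_errors(
    mu=0.05, sigma=0.2, s0=1.0, T=1.0, n_steps=10, n_paths=50000, rng=rng)
print(f"Euler strong error:    {err_euler:.5f}")
print(f"Milstein strong error: {err_milstein:.5f}")
```

A weak error, by contrast, would compare only distributional quantities such as the mean of the terminal value, which is why a generator can match marginal laws and still have a large strong error.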
6. Limitations, Assumptions, and Future Directions
Assumptions and limitations differ across approaches:
- Process Regularity: Generator matching and NJODE methods require right-continuity with left limits and finite second moments for processes; further, Lipschitz conditions on conditional transition laws ensure error control (Jahn et al., 29 May 2025, Crowell et al., 3 Oct 2025).
- Adversarial Training: GAN-based methods may suffer from lack of equilibria and instability in adversarial optimization. They require either exact simulation pairs or sampled conditional distributions for supervision (Rhijn et al., 2021).
- Data Requirements: NJODE frameworks assume access to many independent trajectories and known observation times; adversarial and diffusion-based models can work with single long time series but may need extensive sampling (Crowell et al., 3 Oct 2025).
- Model Specialization: Score-based diffusion frameworks target diffusion-like SDEs; extending superposition to non-Ornstein–Uhlenbeck and general Itô processes is a topic of ongoing research (Skreta et al., 2024).
- Irregular Sampling: The trajectory flow-matching and NJODE frameworks discussed above incorporate mechanisms (autoregressive memory, input masks, or path-dependent conditioning) for dealing with irregular and missing observations.
A plausible implication is that further research will focus on bridging these methods, developing parameter-efficient transfer and fine-tuning, and establishing unified generative frameworks for SDEs with both diffusion and jump components under practical data constraints.
7. Comparative Summary
Generative models for Itô processes can be classified as follows:
| Approach | Parametric Family | Training Principle | Data Adaptivity | Convergence Type |
|---|---|---|---|---|
| Trajectory Matching SDE | Analytic bridges, NN | Flow-matching ($L^2$) | Irregular/autoregressive | Marginal law recovery |
| NJODE | Latent ODE+NN, jumps | Regression (MSE) | Highly flexible | Coefficient convergence + convergence in law |
| Supervised GAN | NN generator | Adversarial + supervised penalty | Conditional, 1D | Strong/weak (empirical) |
| Score-based Diffusion | Neural score models | Likelihood/score | Inference composition | Score marginal & mixture |
Each paradigm provides concrete mechanisms for generative modeling of Itô processes, with varying emphasis on sample-path accuracy, model interpretability, and computational efficiency. Foundational contributions by Jahn et al. (Jahn et al., 29 May 2025), Crowell–Krach–Teichmann (Crowell et al., 3 Oct 2025), and others establish a rigorous basis for the ongoing development and application of these methods to complex, irregular, and high-dimensional time series.