
Path-Dependent Neural Jump ODEs

Updated 29 October 2025
  • The paper introduces a model that unifies continuous latent ODE flows with discrete jump resets to achieve L2-optimal online prediction under irregular observations.
  • It employs signature transforms for history encoding, enabling robust non-Markovian modeling of long-memory time series and chaotic dynamics.
  • Empirical results demonstrate superior forecasting, filtering, and generative performance over traditional ODE-based approaches in diverse application domains.

Path-Dependent Neural Jump Ordinary Differential Equations (PD-NJ-ODEs) are a class of neural sequence models tailored for continuous-time prediction, filtering, and generative modeling of dynamical systems exhibiting both continuous evolution and discrete event-driven discontinuities, typically under irregular and incomplete observational regimes. These models generalize classical Neural ODEs and Neural Jump ODEs to admit arbitrary path-dependent dynamics, accommodating non-Markovianity and jumps while retaining provably optimal estimation properties backed by strong theoretical guarantees.

1. Mathematical Foundations and Model Architecture

PD-NJ-ODEs combine piecewise-continuous latent ODE flows with discrete jumps triggered by stochastic or observed events, where the full past trajectory (rather than only the latest state) determines both the evolution and event intensities. Let $(X_t)_{t \in [0,T]}$ denote a stochastic process, observed at random times $t_i$ with masks $M_{t_i}$ indicating observed coordinates.

Define the filtration $\mathcal{A}_t = \sigma\left( X_{t_i, j},\, t_i,\, M_{t_i} \mid t_i \leq t,\ (M_{t_i})_j = 1 \right)$ encoding all information available up to $t$. The $L^2$-optimal online predictor is the conditional expectation
$$\hat{X}_t = \mathbb{E}\left[ X_t \mid \mathcal{A}_t \right].$$
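To make the observation regime concrete, here is a small, purely illustrative NumPy encoding of one irregularly and incompletely observed trajectory (variable names are hypothetical, not from the cited works):

```python
import numpy as np

# One trajectory in d = 2 coordinates, observed at irregular times t_i; NaNs
# mark coordinates that were not measured, and the mask M_{t_i} records which
# coordinates were actually seen at each observation time.
obs_times = np.array([0.13, 0.58, 0.71])           # random t_i in [0, T]
obs_vals = np.array([[ 1.2, np.nan],
                     [ 0.9, -0.4 ],
                     [np.nan, -0.1]])
masks = (~np.isnan(obs_vals)).astype(int)          # (M_{t_i})_j = 1 iff coord j observed
# The information set A_t at, say, t = 0.6 contains the masked coordinates of
# the first two observations together with their observation times.
```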

PD-NJ-ODE models the hidden state with a latent dynamics equation
$$\frac{d}{dt} H_t = f_{\theta_1}\left( H_{t-},\, t,\, \tau(t),\, \text{history summary},\, X_0,\, \dots \right)$$
with jump resets
$$H_{t_i} = \rho_{\theta_2}\left( H_{t_i-},\, t_i,\, \text{history summary},\, X_0,\, \dots \right),$$
where “history summary” is often implemented by the truncated path signature $\pi_m(\tilde{X}^{\leq \tau(t)} - X_0)$ or other universal path encodings (Krach et al., 2022). The output is $Y_t = g_{\theta_3}(H_t)$.

Jumps are triggered at observation times or stochastic event times, and the latent state is adjusted via a neural map.
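To fix ideas, the following is a minimal PyTorch sketch of this flow/jump/readout structure. The fixed-step Euler integration, layer sizes, and input layout are illustrative assumptions rather than the reference implementation, and the signature/history summary is passed in as a precomputed vector.

```python
import torch
import torch.nn as nn

class PDNJODE(nn.Module):
    """Minimal sketch of the PD-NJ-ODE components: latent flow f_{theta_1},
    jump/reset map rho_{theta_2}, and readout g_{theta_3}."""

    def __init__(self, obs_dim, sig_dim, hidden_dim):
        super().__init__()
        # f_{theta_1}: drives dH/dt between observations; sees the current
        # state, time t, last observation time tau(t), and a history summary.
        self.f = nn.Sequential(
            nn.Linear(hidden_dim + 2 + sig_dim, hidden_dim), nn.Tanh(),
            nn.Linear(hidden_dim, hidden_dim),
        )
        # rho_{theta_2}: resets H at an observation time (the jump).
        self.rho = nn.Sequential(
            nn.Linear(hidden_dim + 1 + sig_dim + obs_dim, hidden_dim), nn.Tanh(),
            nn.Linear(hidden_dim, hidden_dim),
        )
        # g_{theta_3}: readout Y_t = g(H_t).
        self.g = nn.Linear(hidden_dim, obs_dim)

    def flow_step(self, h, t, tau, sig, dt):
        # One explicit Euler step of dH/dt = f(H_{t-}, t, tau(t), summary).
        return h + dt * self.f(torch.cat([h, t, tau, sig], dim=-1))

    def jump(self, h, t, sig, x_obs):
        # Discrete reset of the latent state when X is observed.
        return self.rho(torch.cat([h, t, sig, x_obs], dim=-1))

# Toy rollout: integrate the flow on a fine grid, jump at one observation.
model = PDNJODE(obs_dim=2, sig_dim=6, hidden_dim=32)
h, sig = torch.zeros(1, 32), torch.zeros(1, 6)   # latent state, history summary
t_obs, x_obs, dt = 0.5, torch.randn(1, 2), 0.01
for step in range(100):
    t_val = step * dt
    t = torch.full((1, 1), t_val)
    tau = torch.full((1, 1), t_obs if t_val >= t_obs else 0.0)
    h = model.flow_step(h, t, tau, sig, dt)
    if abs(t_val - t_obs) < dt / 2:              # observation arrives
        h = model.jump(h, t, sig, x_obs)         # (history summary would be updated here)
    y_hat = model.g(h)                           # online prediction of E[X_t | A_t]
```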

2. Path-Dependence and Conditional Expectation

Unlike Markovian models, PD-NJ-ODEs allow the evolution to depend on the entire available path. Universal approximation is achieved by encoding the observed trajectory (possibly incomplete and irregular) with the signature transform
$$\pi_m\left( \tilde{X}^{\leq \tau(t)} - X_0 \right),$$
which captures the path's algebraic (iterated-integral) information up to truncation degree $m$, ensuring that the model can represent any functional of the observed history (Krach et al., 2022).

Consequently, the model admits non-Markovian behaviors such as long-memory, self-excitation, delayed inhibition, and more general path-functional dependencies.
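As a concrete illustration of the history encoding, here is a self-contained NumPy sketch computing the signature truncated at degree $m = 2$ for a piecewise-linear path. The degree-2 cutoff and the function name are illustrative choices; practical implementations typically use dedicated signature libraries and higher truncation levels.

```python
import numpy as np

def signature_level2(path):
    """Signature of a piecewise-linear path, truncated at degree m = 2.

    path: (N, d) array of sample points; the signature is taken of the
    increments (i.e., of path - path[0], matching pi_m(X - X_0)).
    Returns the level-1 terms (total increment, shape (d,)) and the
    level-2 iterated integrals (shape (d, d)); both are exact for
    piecewise-linear interpolation.
    """
    increments = np.diff(path, axis=0)            # (N-1, d) segment increments
    d = path.shape[1]
    s1 = increments.sum(axis=0)                   # level 1: X_T - X_0
    s2 = np.zeros((d, d))
    running = np.zeros(d)                         # increment accumulated so far
    for dx in increments:
        # Chen's identity per linear segment: cross term + half square term.
        s2 += np.outer(running, dx) + 0.5 * np.outer(dx, dx)
        running += dx
    return s1, s2

# Example: encode an irregularly sampled, time-augmented 2-d toy path.
t = np.sort(np.random.rand(20))
path = np.column_stack([t, np.cumsum(np.random.randn(20))])
s1, s2 = signature_level2(path)   # candidate "history summary" input
```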

3. Training Objective and Theoretical Guarantees

The objective is to minimize an empirical risk aggregating squared errors at observation times, focusing on pre-jump predictions for noisy observations:
$$\Psi_{\text{noisy}}(Y) = \mathbb{E}\left[ \frac{1}{n} \sum_{i=1}^n \left\| M_i \odot \left( O_{t_i} - Y_{t_i-} \right) \right\|_2^2 \right],$$
where $O_{t_i}$ denotes the noisy observation at $t_i$.

For noiseless or complete observations, the classical NJ-ODE loss is used:
$$\Psi(Y) = \mathbb{E}\left[ \frac{1}{n} \sum_{i=1}^n \left( \left\| X_{t_i} - Y_{t_i} \right\|_2 + \left\| Y_{t_i} - Y_{t_i-} \right\|_2 \right)^2 \right].$$
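Both objectives translate directly into code. The following PyTorch sketch evaluates them on a single batch of observations, with the pre-jump predictions $Y_{t_i-}$ and post-jump predictions $Y_{t_i}$ assumed to be supplied as separate tensors (an illustrative interface, not the reference one):

```python
import torch

def njode_loss(x_obs, y_post, y_pre):
    """Classical NJ-ODE loss Psi: mean over observations of
    (||X_{t_i} - Y_{t_i}||_2 + ||Y_{t_i} - Y_{t_i-}||_2)^2.

    x_obs, y_post, y_pre: (n, d) observations, post-jump predictions Y_{t_i},
    and pre-jump (left-limit) predictions Y_{t_i-}.
    """
    post_err = torch.linalg.vector_norm(x_obs - y_post, dim=-1)
    jump_size = torch.linalg.vector_norm(y_post - y_pre, dim=-1)
    return ((post_err + jump_size) ** 2).mean()

def njode_loss_noisy(o_obs, y_pre, mask):
    """Noise-adapted loss Psi_noisy: masked squared error of the pre-jump
    prediction against the noisy observation O_{t_i}; comparing against
    Y_{t_i-} keeps the estimator from fitting the observation noise."""
    err = mask * (o_obs - y_pre)                  # M_i ⊙ (O_{t_i} - Y_{t_i-})
    return err.pow(2).sum(dim=-1).mean()
```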

Under mild regularity and boundedness assumptions, minimization yields convergence in $L^2$ to the optimal predictor $\hat{X}_t = \mathbb{E}[X_t \mid \mathcal{A}_t]$ (Krach et al., 2022, Andersson et al., 2023). Conditional independence of observations replaces previously restrictive independence assumptions, aligning the theory with realistic data acquisition mechanisms.

4. Extensions: Noisy and Dependent Observations

PD-NJ-ODEs are extended to noisy settings by using a noise-adapted loss and pre-jump predictions, ensuring the estimator targets the conditional expectation given noisy data and not the raw noisy observations (Andersson et al., 2023).

For dependent observation times (e.g., clinical triggers based on patient state), the model and convergence proof are generalized to conditional independence: the observation mechanism may depend on the past, but not on the unobserved present. The same universal estimation guarantee holds without algorithmic changes (Andersson et al., 2023).

5. Generative Modeling and Online Filtering

PD-NJ-ODEs and their generative variants (NJ-ODEs as generative models; Crowell et al., 3 Oct 2025) approximate both drift and diffusion coefficients of path-dependent Itô processes from data alone. Training on conditional prediction tasks (e.g., next state, increments, quadratic increments) allows the model to recover instantaneous coefficients such as
$$\hat{\mu}_t^\Delta = \frac{1}{\Delta}\, \mathbb{E}\left[ X_{t+\Delta} - X_t \mid \mathcal{A}_t \right],$$
which are then used for Euler–Maruyama simulation to generate new sample paths under the learned law.
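A schematic of the resulting generative step: given learned coefficient estimators (the callables mu_hat and sigma_hat below are hypothetical placeholders for networks trained on the conditional-moment targets above), new paths are sampled via Euler–Maruyama.

```python
import torch

def euler_maruyama(mu_hat, sigma_hat, x0, history, T=1.0, n_steps=100):
    """Sample one path under the learned law via Euler-Maruyama.

    mu_hat(t, x, history) -> (d,) drift estimate, e.g. learned from
        (1/Delta) * E[X_{t+Delta} - X_t | A_t] targets;
    sigma_hat(t, x, history) -> (d, d) diffusion estimate, e.g. learned
        from quadratic-increment targets.
    """
    dt = T / n_steps
    x = x0.clone()
    path = [x.clone()]
    for k in range(n_steps):
        t = k * dt
        dw = torch.randn_like(x) * dt ** 0.5      # Brownian increment ~ N(0, dt)
        x = x + mu_hat(t, x, history) * dt + sigma_hat(t, x, history) @ dw
        path.append(x.clone())
        # For a path-dependent law, `history` would be updated with (t, x) here.
    return torch.stack(path)

# Toy usage with constant stand-in coefficients (illustrative only).
d = 2
path = euler_maruyama(
    mu_hat=lambda t, x, h: torch.zeros(d),
    sigma_hat=lambda t, x, h: 0.2 * torch.eye(d),
    x0=torch.zeros(d), history=None,
)
```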

The framework supports conditional path generation based on any discrete, irregular observational history, robustly handling missing data and incomplete sampling without imputation.

6. Practical Applications and Empirical Results

PD-NJ-ODEs have been empirically validated across domains:

  • Marked and classical point processes: Event intensity estimation, path-dependent Hawkes/self-correcting/seismic processes (Jia et al., 2019).
  • Chaotic dynamical systems: Learning of double pendulum dynamics from samples, improving long-term prediction via input-skipping and output-feedback mechanisms (Krach et al., 26 Jul 2024).
  • Non-Markovian, long-memory time series: Fractional Brownian motion, limit order book, and medical time-series forecasting (Krach et al., 2022).
  • Irregular and incomplete data: Robustness to missingness, noisy measurements, and dependent sampling schedules (Andersson et al., 2023, Crowell et al., 3 Oct 2025).

In all cases, PD-NJ-ODEs demonstrated theoretical or strong empirical advantages over standard NJ-ODEs, ODE-RNN, GRU-ODE-Bayes, and other neural sequence models, particularly when path dependence or event-driven jumps are statistically relevant.

7. Model Components and Summary Table

| Component | Function | Neural Parameterization |
|---|---|---|
| Latent ODE flow | Continuous-time evolution (path-dependent) | $f_{\theta_1}$ (MLP; signature/summary input) |
| Jump/reset module | Discrete update at event/observation times | $\rho_{\theta_2}$ (MLP/encoder; path/signature input) |
| Output map | Prediction from latent state | $g_{\theta_3}$ (MLP) |
| History encoding | Path summary (signature, mask, statistics) | $\pi_m(\cdot)$ or equivalent |

PD-NJ-ODEs enable principled, universal, and optimal online forecasting, filtering, and generative modeling for hybrid systems exhibiting both continuous flows and discrete, path-dependent jumps—operating effectively even in imperfect, non-Markovian, and irregular observational scenarios, with robust theoretical foundations and empirical efficacy (Krach et al., 2022, Andersson et al., 2023, Krach et al., 26 Jul 2024, Crowell et al., 3 Oct 2025).
