
Linear Multistep Methods for Discovery (LMM-DMD)

Updated 15 February 2026
  • LMM-DMD is a data-driven framework that uses classical linear multistep methods and neural approximators to recover unknown vector fields in ODE models from sampled state data.
  • It rigorously decomposes the overall error into discretization and network approximation components, with convergence guarantees based on consistency and stability criteria.
  • The method has been validated on a range of systems—from linear oscillators to chaotic and high-dimensional biochemical networks—demonstrating broad applicability.

Linear Multistep Methods for Discovery (LMM-DMD) are a class of algorithms for data-driven inference of continuous-time dynamical systems from observed state data. LMM-DMD leverages classical linear multistep methods (LMMs) for discretizing time derivatives, combining them with expressive neural approximators—such as feedforward neural networks or Kolmogorov–Arnold networks (KANs)—to recover the vector fields governing unknown ordinary differential equations (ODEs). Rigorous error analyses underpin the approach, decomposing the discovery error into contributions from both numerical discretization and function approximation.

1. Mathematical Framework

The central objective of LMM-DMD is the recovery of an unknown vector field $f : \mathbb{R}^d \to \mathbb{R}^d$ in the autonomous ODE

$$\dot x(t) = f(x(t)), \qquad x(0) = x_0 \in \mathbb{R}^d,$$

given a sequence of uniformly sampled state observations $x_n = x(t_n)$, with $t_n = n h$, $n = 0, \dots, N$, and step size $h$.

A general $s$-step LMM for approximating time derivatives has the form

$$\sum_{j=0}^s \alpha_j x_{n+j} = h \sum_{j=0}^s \beta_j f(x_{n+j}),$$

where the coefficients $\{\alpha_j\}$ and $\{\beta_j\}$ encode the specific numerical scheme: e.g., Adams–Bashforth (AB), Adams–Moulton (AM), or Backward Differentiation Formula (BDF).

The inverse problem is formulated by treating the sampled $\{x_n\}$ as fixed and seeking to identify $f$ (or its parametric approximation $f_\theta$) so that the LMM residuals are minimized or vanish:

$$J_h(\theta) = \frac{1}{N-s+1} \sum_{n=0}^{N-s} \left\| \sum_{j=0}^s \alpha_j x_{n+j} - h \sum_{j=0}^{s} \beta_j f_\theta(x_{n+j}) \right\|^2.$$

Parametric representations using neural architectures such as deep feedforward networks or B-spline-based Kolmogorov–Arnold networks (KANs) are adopted for $f_\theta$. Auxiliary conditions, such as finite-difference initializations for $f(x_j)$ at early time steps, may be imposed to guarantee a unique minimizer when the resulting system is under- or over-determined (Hu et al., 25 Jan 2025; Du et al., 2021; Keller et al., 2019).
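As a concrete illustration, the residual loss $J_h(\theta)$ can be evaluated directly from sampled data. The sketch below (function names and the AB-2 coefficient choice are illustrative, not from the cited papers) checks that the true vector field of $\dot x = -x$ leaves only a tiny discretization-level residual, while a wrong candidate field does not.

```python
import numpy as np

# Illustrative sketch: evaluate the LMM residual loss J_h for a candidate
# vector field on uniformly sampled trajectory data. The coefficients encode
# the 2-step Adams-Bashforth scheme (AB-2):
#   x_{n+2} - x_{n+1} = h * (3/2 f(x_{n+1}) - 1/2 f(x_n))
alpha = np.array([0.0, -1.0, 1.0])   # alpha_0, alpha_1, alpha_2
beta = np.array([-0.5, 1.5, 0.0])    # beta_0,  beta_1,  beta_2

def lmm_residual_loss(x, h, f):
    """Mean squared LMM residual J_h over a trajectory x of shape (N+1, d)."""
    s = len(alpha) - 1
    N = x.shape[0] - 1
    fx = np.array([f(xn) for xn in x])       # f evaluated at every sample
    total = 0.0
    for n in range(N - s + 1):
        lhs = sum(alpha[j] * x[n + j] for j in range(s + 1))
        rhs = h * sum(beta[j] * fx[n + j] for j in range(s + 1))
        total += np.sum((lhs - rhs) ** 2)
    return total / (N - s + 1)

# Exact samples of x' = -x, x(0) = 1: the true field nearly annihilates the
# residual; a sign-flipped field produces a residual of order h^2 per term.
h = 1e-3
x = np.exp(-np.arange(0.0, 1.0, h)).reshape(-1, 1)
loss_true = lmm_residual_loss(x, h, lambda z: -z)
loss_wrong = lmm_residual_loss(x, h, lambda z: z)
```

In practice the minimization runs over network parameters $\theta$ rather than comparing fixed candidate fields, but the residual being minimized is exactly this quantity.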

2. Neural Network and Spline-Based Approximators

The LMM-DMD framework accommodates various function classes for $f_\theta$:

  • Feedforward Neural Networks: Deep fully connected networks (e.g., ReLU or tanh activations) parameterize $f_\theta$. A typical architecture employs depth $L$, uniform width $W$, and nonlinearity $\sigma$:

$$f_\theta(x) = W_{L+1}\,\sigma( W_L\,\sigma( \cdots \sigma(W_1 x + b_1) \cdots ) + b_L ) + b_{L+1}.$$

The parameter count is $O(L W^2)$ (Du et al., 2021; Zhu et al., 2022).
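A minimal numpy sketch of such a parameterization (the dimensions and the Gaussian initialization are illustrative assumptions, not the papers' settings) makes the $O(L W^2)$ parameter count concrete:

```python
import numpy as np

# Minimal sketch (assumed architecture): a depth-L, width-W fully connected
# network f_theta: R^d -> R^d with tanh activation, matching the form
# f_theta(x) = W_{L+1} sigma(... sigma(W_1 x + b_1) ...) + b_{L+1}.
rng = np.random.default_rng(0)
d, L, W = 3, 4, 32

# Hidden layers d -> W -> ... -> W (L weight matrices), then linear readout W -> d.
dims = [d] + [W] * L + [d]
weights = [rng.standard_normal((dims[i + 1], dims[i])) / np.sqrt(dims[i])
           for i in range(len(dims) - 1)]
biases = [np.zeros(dims[i + 1]) for i in range(len(dims) - 1)]

def f_theta(x):
    """Forward pass: tanh on hidden layers, linear output layer."""
    for Wm, bm in zip(weights[:-1], biases[:-1]):
        x = np.tanh(Wm @ x + bm)
    return weights[-1] @ x + biases[-1]

# The count is dominated by the L - 1 hidden W-by-W blocks, i.e. O(L W^2).
n_params = sum(Wm.size for Wm in weights) + sum(bm.size for bm in biases)
```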

  • Kolmogorov–Arnold Networks (KANs): For increased approximation capacity and explicit error bounds, two-layer KANs with B-spline activation functions are constructed. Each scalar component $f_i(x)$ is approximated as

$$\phi(x) = \sum_{j=1}^N \phi_{1,j} \left( \sum_{i=1}^d \phi_{0,j,i}(x_i) \right),$$

where $\phi_{l,\ast}(\xi) = \sum_{q=0}^{G+k-1} c^{(l)}_q B_q^{(k)}(\xi)$ employs B-splines of degree $k$ and $G$ interior knots.

The approximation error of KANs is quantified in the uniform norm:

$$\|f-\phi\|_{L^\infty([0,1]^d)} \leq N (L_G d + 1)\, \omega_f\!\left( \min\{ (2k-2)^{-1/2}, \sqrt{k/12G} \} \right),$$

where $L_G$ denotes a spline-induced Lipschitz constant and $\omega_f(r)$ is the modulus of continuity of $f$ (Hu et al., 25 Jan 2025).
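To make the B-spline building block concrete, here is one common open-uniform construction on $[0,1]$ evaluated by the Cox–de Boor recursion. This is a sketch under our own conventions, not the papers' implementation; with these clamped knots one obtains $G + k + 1$ basis functions, and indexing conventions vary across references.

```python
import numpy as np

# Cox-de Boor evaluation of the degree-k B-spline basis on [0, 1] with G
# interior knots (open uniform knot vector: endpoints repeated k+1 times).
def bspline_basis(xi, k, G):
    """Values of all degree-k basis functions at a point xi in [0, 1]."""
    interior = np.linspace(0.0, 1.0, G + 2)[1:-1]
    knots = np.concatenate([np.zeros(k + 1), interior, np.ones(k + 1)])
    m = len(knots) - 1
    # Degree 0: indicators of [knots[i], knots[i+1]); close the last interval at 1.
    B = np.array([
        1.0 if (knots[i] <= xi < knots[i + 1])
        or (xi == 1.0 and knots[i] < knots[i + 1] == 1.0) else 0.0
        for i in range(m)
    ])
    for d in range(1, k + 1):          # raise the degree one step at a time
        Bn = np.zeros(m - d)
        for i in range(m - d):
            left = 0.0 if knots[i + d] == knots[i] else \
                (xi - knots[i]) / (knots[i + d] - knots[i]) * B[i]
            right = 0.0 if knots[i + d + 1] == knots[i + 1] else \
                (knots[i + d + 1] - xi) / (knots[i + d + 1] - knots[i + 1]) * B[i + 1]
            Bn[i] = left + right
        B = Bn
    return B

# The basis is a nonnegative partition of unity on [0, 1].
b = bspline_basis(0.37, k=3, G=8)
```

Each KAN activation $\phi_{l,\ast}(\xi)$ is then a learned linear combination of these basis values.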

3. Discretization, Consistency, and Stability in Discovery

The convergence theory for LMM-DMD diverges from that of forward integration. The defining LMM system for the inverse problem is

$$\sum_{m=0}^{M} \alpha_m y_{n-m} - h \sum_{m=0}^{M} \beta_m f(y_{n-m}) = 0,$$

with data $\{y_n\}$. Consistency and stability criteria for convergence in the discovery context differ as follows (Keller et al., 2019):

  • Consistency: The local defect $\tau_h$ on the trajectory must satisfy $\|\tau_h\|_\infty = O(h^p)$, where $p$ is the LMM order.
  • Stability: The root condition for the second characteristic polynomial $\sigma(\zeta) = \sum_{m=0}^M \beta_m \zeta^{M-m}$ governs the stability and conditioning of the recovery of $f$. The strong root condition (all roots strictly inside the unit disc) ensures bounded error growth with $N$.
  • Convergence Theorem: If the second polynomial satisfies the strong root condition and the method is consistent, then

$$\max_{M \leq n \leq N} \| \hat f(y_n) - f(y_n) \| = O(h^p).$$

Analysis for specific schemes yields:

| Scheme | Orders with Discovery Convergence | Root Condition |
| --- | --- | --- |
| Adams–Bashforth | $1 \leq M \leq 6$ | Strong root condition holds |
| Adams–Moulton | $M = 0, 1$ only | $M \geq 2$ unstable |
| BDF | $M \geq 1$ | Always stable |

(Keller et al., 2019; Du et al., 2021)
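The table's stability claims can be checked numerically: the strong root condition asks that every root of $\sigma(\zeta)$ lie strictly inside the unit disc. The snippet below (the scheme coefficients are standard; the helper name is ours) confirms that AB-2 and BDF-2 pass while AM-2 fails.

```python
import numpy as np

# Strong root condition on the second characteristic polynomial
# sigma(zeta) = sum_m beta_m zeta^(M-m) for a few classical 2-step schemes.
def strong_root_condition(beta, tol=1e-12):
    """True iff all roots of sigma lie strictly inside the unit disc."""
    roots = np.roots(beta)   # beta listed from beta_0 (highest power) downward
    return bool(np.all(np.abs(roots) < 1 - tol))

schemes = {
    "AB-2":  [0.0, 1.5, -0.5],       # explicit: beta_0 = 0
    "AM-2":  [5/12, 8/12, -1/12],    # implicit 2-step Adams-Moulton
    "BDF-2": [2/3, 0.0, 0.0],        # only beta_0 nonzero
}
stable = {name: strong_root_condition(b) for name, b in schemes.items()}
```

For AM-2 the roots of $\sigma$ are approximately $0.12$ and $-1.72$; the root outside the unit disc is what makes higher-order Adams–Moulton recovery unstable for discovery.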

4. Error Bounds: Approximation and Discretization Splitting

The LMM-DMD error admits an explicit decomposition into discretization and learning (approximation) terms:

$$\| f_\theta(x_n) - f(x_n) \|_{\ell^2_h} \leq C\,\kappa_2(A_h) \left( h^p + \varepsilon_{\rm net} \right),$$

where $\varepsilon_{\rm net}$ is the best-approximation error achievable by $f_\theta$, $\kappa_2(A_h)$ is the condition number of the LMM matrix (uniformly bounded for zero-stable schemes), and $C$ is independent of $h$ and the network size (Du et al., 2021; Hu et al., 25 Jan 2025).

For KANs, the uniform-norm error can be further bounded via the Kolmogorov superposition theorem and explicit B-spline regularity. Provided that $f$ is Hölder continuous with exponent $\alpha$,

$$\| f - \phi \|_\infty \leq \lambda N (L_G d + 1) \left( \min\{ (2k-2)^{-1/2}, \sqrt{k/12G} \} \right)^\alpha.$$

The solution-trajectory error for the learned $f_\theta$, by Grönwall's inequality, satisfies

$$\max_n \| x(t_n) - \tilde x(t_n) \| \leq C( h^p + \varepsilon_{\rm net} ).$$

Error analyses using inverse modified differential equations (IMDE) further show that the total approximation error on compact domains can be expressed as

$$\int_K \| f_{\theta^*} - f \| \leq c_2 h^p + c_0 \sqrt{\mathcal{L}},$$

where $\mathcal{L}$ is the learning (training) loss (Zhu et al., 2022).

5. Representative Numerical Experiments

LMM-DMD has been systematically validated across a range of ODE benchmarks:

  • Linear and Nonlinear Oscillators: Recovery of simple, damped, and cubic oscillators using both explicit (AB) and implicit (BDF, AM) LMMs. Empirically, fitted-field errors decay at $O(h^p)$ with decreasing step size until limited by approximation errors, precisely as predicted by theory.
  • Lorenz '63 (Chaotic): On canonical chaotic trajectories, both Adams–Bashforth and BDF methods of orders 1–4 yield convergence rates matching the multistep order for sufficiently small $h$. For long prediction horizons, errors grow exponentially due to positive Lyapunov exponents.
  • Biochemical Networks (e.g., Glycolytic Oscillator, $d = 7$): Using AM-1 with a KAN ($k = 3$, $G = 64$, $h = 10^{-3}$), the method accurately recovers all species time series in both training and extrapolation regions, retaining small sup-norm errors.

In all cases, the observed grid errors mirror theoretical predictions, exhibiting polynomial decay in $h$ and saturation at the neural approximation threshold. BDF and low-order AB schemes are especially robust in high-dimensional or stiff regimes (Hu et al., 25 Jan 2025; Du et al., 2021; Keller et al., 2019; Zhu et al., 2022).
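The reported $O(h^p)$ decay is easy to reproduce in the simplest setting: with AB-1 (forward Euler), the pointwise recovery $\hat f(x_n) = (x_{n+1} - x_n)/h$ for $\dot x = -x$ should lose roughly half its error when $h$ is halved. This is an illustrative check of the convergence theory, not one of the cited experiments.

```python
import numpy as np

# Empirical order check: recover f on the grid for x' = -x, x(0) = 1, via the
# 1-step Adams-Bashforth (forward Euler) residual, and measure the O(h) decay.
def discovery_error(h):
    t = np.arange(0.0, 1.0 + h / 2, h)
    x = np.exp(-t)                        # exact trajectory samples
    f_hat = (x[1:] - x[:-1]) / h          # AB-1 pointwise recovery of f(x_n)
    return np.max(np.abs(f_hat - (-x[:-1])))

e1, e2 = discovery_error(1e-2), discovery_error(5e-3)
rate = np.log2(e1 / e2)                   # observed order, should be close to p = 1
```

With a neural parameterization the same curve flattens once $h^p$ drops below the network's approximation floor $\varepsilon_{\rm net}$, which is the saturation behavior described above.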

6. Practical Considerations, Limitations, and Recommendations

Robust application of LMM-DMD mandates attention to several practical aspects:

  • Scheme Selection: Only LMMs whose second characteristic polynomial satisfies the strong root condition ensure stability for discovery. BDF methods (any order) and AB up to sixth order are safe. AM is limited to AM-0 (implicit Euler) and AM-1 (trapezoid rule).
  • Auxiliary Initial Conditions: Over- or under-determined linear systems may require finite-difference approximations for initialization, especially for explicit methods such as AB; BDF methods are less dependent on such conditions.
  • Grid Step Size and Network Capacity: To ensure the error is dominated by discretization rather than network approximation, one should choose $h$ so that $h^p$ exceeds the learning floor, with networks sufficiently wide and deep that $\varepsilon_{\rm net} \ll h^p$.
  • Analyticity and High-Frequency Modes: Theoretical guarantees rely on sufficient regularity (analyticity) of both the true and learned vector fields. Practical implementations benefit from the implicit spectral bias of SGD, but explicit regularization may be required for stiff or highly oscillatory systems.
  • Limitations: High-dimensional systems and stiff or multiscale dynamics increase the required data and network size. Large $N$ or $G$ in KAN-based approaches may be computationally expensive. Theoretical results assume noiseless, densely sampled trajectories, and extension to noisy or incomplete data remains an open area.

LMM-DMD provides a sharp analytic and algorithmic foundation for data-driven ODE discovery, integrating high-order numerical schemes and expressive neural approximators with rigorous a priori error control (Hu et al., 25 Jan 2025; Du et al., 2021; Keller et al., 2019; Zhu et al., 2022).

7. Research Directions and Applications

The LMM-DMD framework is broadly applicable across domains where the governing dynamical law is unknown, but state trajectories can be sampled:

  • System Identification: Physics, biology (e.g., cellular signaling networks, gene regulatory dynamics), engineering (control systems), and neuroscience (population dynamics).
  • Model Discovery with Noisy Data: While the canonical theory addresses noiseless data, extensions incorporating regularization or robust statistics are becoming increasingly relevant.
  • PDE Discovery and Multi-Scale Systems: Extensions that incorporate spatial discretization, run multi-stage or multirate schemes, or recover partial differential equations are active research areas.
  • Comparison and Integration with Other Methods: LMM-DMD complements approaches such as Runge–Kutta discovery or sparse regression (e.g., SINDy), providing high-order accuracy and error bounds conditioned on neural-network representational power.

LMM-DMD thus constitutes a rigorous, versatile, and high-accuracy paradigm for the inverse modeling of continuous-time dynamics from time-series data, with proven convergence theory and broad relevance across scientific domains.
