
Multiscale Inference for Diffusion Models

Updated 25 November 2025
  • The paper introduces a multiscale inference scheme that uses contrast minimization to estimate drift parameters from slow-scale observations in systems with fast–slow dynamics.
  • It employs a stochastic Taylor expansion to approximate the behavior of the slow process and derive effective limits in both homogenization and averaging regimes.
  • It establishes consistency and asymptotic normality for the estimators while addressing computational challenges with simplified covariance structures.

A multiscale inference scheme for diffusion models provides a principled methodology for inferring unknown parameters or latent processes in systems governed by stochastic dynamics with multiple well-separated time scales. These schemes address the challenges posed by disparate dynamical regimes—often encountered in physical, biological, and engineered systems—where a slow “coarse” process is coupled with rapidly fluctuating “fast” variables. Observational data typically only capture the slow component, necessitating statistically efficient and robust inference strategies that account for the multiscale structure and its homogenization or averaging limits.

1. Multiscale Diffusion Models: Fast–Slow Dynamics

Consider a multiscale stochastic system on a fixed time horizon $[0,T]$ consisting of coupled slow and fast variables $(X^\varepsilon_t, Y^\varepsilon_t)$ with dynamics

$$\begin{aligned} dX^{\varepsilon}_t &= \frac{\epsilon}{\delta}\,b_\theta(X^{\varepsilon}_t,Y^{\varepsilon}_t)\,dt + c_\theta(X^{\varepsilon}_t,Y^{\varepsilon}_t)\,dt + \sqrt{\epsilon}\,\sigma(X^{\varepsilon}_t,Y^{\varepsilon}_t)\,dW_t, \\ dY^{\varepsilon}_t &= \frac{\epsilon}{\delta^2}\,f(X^{\varepsilon}_t,Y^{\varepsilon}_t)\,dt + \frac{1}{\delta}\,g(X^{\varepsilon}_t,Y^{\varepsilon}_t)\,dt + \frac{\sqrt\epsilon}{\delta}\,\tau_1(X^{\varepsilon}_t,Y^{\varepsilon}_t)\,dW_t + \frac{\sqrt\epsilon}{\delta}\,\tau_2(X^{\varepsilon}_t,Y^{\varepsilon}_t)\,dB_t, \end{aligned}$$

where $W$ and $B$ are independent Wiener processes, $\epsilon>0$ is the noise scale, and $\delta=\delta(\epsilon)$ is a small parameter controlling scale separation, with $\delta \to 0$ as $\epsilon \to 0$. The target is to estimate unknown drift parameters $\theta$ from observations of $X^\varepsilon$ alone.

Two asymptotic regimes are central:

  • Homogenization regime ($\epsilon/\delta \to \infty$): The fast process is sufficiently rapid to justify a homogenized effective equation for $X^\varepsilon$, under a centering condition on $b_\theta$ with respect to the invariant measure $\mu_{\infty, x}$ of the frozen fast dynamics.
  • Averaging regime ($\epsilon/\delta \to \gamma\in(0,\infty)$): The effective coefficients involve averages over the fast variable with finite memory.
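To make the fast–slow structure concrete, the following minimal sketch simulates one path of the system with an Euler–Maruyama scheme. All coefficient choices are hypothetical and chosen only for illustration (they are not taken from the source): $b_\theta = 0$, $c_\theta(x,y) = -\theta x + y$, $\sigma = 1$, $f(x,y) = -y$, $g = 0$, $\tau_1 = 0$, $\tau_2 = \sqrt{2}$. Under these choices the frozen fast dynamics is an Ornstein–Uhlenbeck process with standard Gaussian invariant measure. The function name `simulate_slow_fast` and its arguments are likewise assumptions.

```python
import math
import random


def simulate_slow_fast(theta, eps, delta, x0=1.0, y0=0.0, T=1.0, dt=1e-4, seed=0):
    """Euler-Maruyama path of the slow component X for a toy slow-fast system.

    Hypothetical coefficients (illustration only, not from the source):
      b = 0, c_theta(x, y) = -theta * x + y, sigma = 1,
      f(x, y) = -y, g = 0, tau_1 = 0, tau_2 = sqrt(2).
    The step dt should be small compared with delta**2 / eps for stability.
    """
    rng = random.Random(seed)
    n_steps = int(round(T / dt))
    sqrt_dt = math.sqrt(dt)
    sqrt_eps = math.sqrt(eps)
    x, y = x0, y0
    xs = [x]
    for _ in range(n_steps):
        dW = rng.gauss(0.0, sqrt_dt)  # Brownian increment driving X
        dB = rng.gauss(0.0, sqrt_dt)  # independent increment driving Y
        # Slow equation: (eps/delta) * b vanishes here because b = 0.
        x_new = x + (-theta * x + y) * dt + sqrt_eps * dW
        # Fast equation: drift scaled by eps/delta^2, noise by sqrt(eps)/delta.
        y_new = y - (eps / delta**2) * y * dt + (sqrt_eps / delta) * math.sqrt(2.0) * dB
        x, y = x_new, y_new
        xs.append(x)
    return xs
```

For example, `simulate_slow_fast(theta=1.5, eps=0.01, delta=0.01)` has $\epsilon/\delta = 1$, which mimics the averaging regime; shrinking `delta` faster than `eps` moves toward the homogenization regime but requires a smaller `dt`.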

2. Stochastic Taylor Expansion and Effective Limit

The stochastic behavior of $X^\varepsilon$ is approximated by the expansion $X^{\varepsilon}_t \approx \bar X_{*,t} + \sqrt{\epsilon}\,\varphi^{\varepsilon}_t$, where $\bar X_{*,t}$ solves the deterministic averaged ODE

$$\dot{\bar X}_{*,t} = \bar\lambda_*(\bar X_{*,t}), \qquad \bar X_{*,0}=x_0,$$

with $\bar\lambda_*(x)$ given by an average of explicit functions of the coefficients over the fast variable (different for homogenization and averaging).

A second-order pathwise stochastic Taylor expansion is derived:

$$\frac{X^{\varepsilon}_t-\bar X_{*,t}}{\sqrt\epsilon} = \frac{1}{\ell_*}\int_0^t Z_{*}(t,s)\,\overline{\nabla_y\Phi_*\cdot g}(\bar X_{*,s})\,ds + \int_0^t Z_{*}(t,s)\,(\sigma+\nabla_y\psi_*\cdot\tau_1)\,dW_s + \int_0^t Z_{*}(t,s)\,\nabla_y\psi_*\cdot\tau_2\,dB_s + o_P(1),$$

where $Z_*(t,s)$ is the fundamental solution of the linearized system, and $\psi_*$ and $\Phi_*$ are solutions of Poisson equations associated with the fast process. This expansion provides the statistical basis for an approximate transition density of $X^\varepsilon$ and motivates the ensuing estimation schemes.
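The two deterministic ingredients of the expansion, the averaged path $\bar X_{*,t}$ and the fundamental solution $Z_*(t,s)$, can be computed numerically. The sketch below does this for a scalar state with a classical fourth-order Runge–Kutta scheme, assuming the averaged drift `lam` and its derivative `dlam` are available as Python callables; the function name and signature are hypothetical, and the source does not prescribe a particular ODE solver.

```python
def solve_averaged_and_linearized(lam, dlam, x0, T, n_steps=1000):
    """Integrate dX/dt = lam(X) and dZ/dt = dlam(X) * Z, Z(0) = 1 (scalar case).

    Uses classical RK4 on the coupled pair (X, Z).
    Returns (ts, xs, zs), where zs[i] approximates Z(ts[i], 0).
    """
    h = T / n_steps

    def rhs(x, z):
        # Right-hand side of the coupled system (averaged ODE, linearization).
        return lam(x), dlam(x) * z

    x, z = x0, 1.0
    ts, xs, zs = [0.0], [x], [z]
    for i in range(n_steps):
        k1x, k1z = rhs(x, z)
        k2x, k2z = rhs(x + 0.5 * h * k1x, z + 0.5 * h * k1z)
        k3x, k3z = rhs(x + 0.5 * h * k2x, z + 0.5 * h * k2z)
        k4x, k4z = rhs(x + h * k3x, z + h * k3z)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        z += h * (k1z + 2.0 * k2z + 2.0 * k3z + k4z) / 6.0
        ts.append((i + 1) * h)
        xs.append(x)
        zs.append(z)
    return ts, xs, zs
```

In the scalar case $Z(t,s)$ for $s>0$ can be recovered as $Z(t,0)/Z(s,0)$; for vector-valued states the same idea applies with a matrix-valued $Z$, which this sketch does not cover.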

3. Minimum Contrast and Simplified Estimators

Utilizing the expansion, the inference strategy is to define an explicit Gaussian “misspecified model” for the increments of $X^\varepsilon$ between discrete observation times $t_k = k\Delta$, $\Delta = T/n$. For any trial parameter $\theta$:

  • Minimum Contrast Estimator (MCE):
    • For each $k=1,\ldots,n$, compute

    $$F_k^\varepsilon(\theta) := [x_{t_k} - \bar X^\theta_{t_k}] - Z^\theta(t_k, t_{k-1})[x_{t_{k-1}} - \bar X^\theta_{t_{k-1}}] - \sqrt\epsilon\int_{t_{k-1}}^{t_k}Z^\theta(t_k,s)\,\bar J^\theta(\bar X^\theta_s)\,ds$$

    and the corresponding covariance increment

    $$Q_k(\theta) := \int_{t_{k-1}}^{t_k} Z^\theta(t_k,s)\,\bar q^\theta(\bar X^\theta_s)\,Z^\theta(t_k,s)^{T}\,ds.$$

    • The contrast functional is then

    $$U^\varepsilon(\theta) = \sum_{k=1}^n [F_k^\varepsilon(\theta)]^T Q_k(\theta)^{-1} F_k^\varepsilon(\theta).$$

    • The MCE is $\hat\theta_{\rm MCE}^\varepsilon = \arg\min_{\theta} U^\varepsilon(\theta)$.

  • Simplified MCE (SMCE):

    • Omits covariance weighting and the $\sqrt\epsilon$ drift-correction term. Define

    $$\tilde F_k(\theta) := [x_{t_k} - \bar X^\theta_{t_k}] - Z^\theta(t_k, t_{k-1})[x_{t_{k-1}} - \bar X^\theta_{t_{k-1}}].$$

    • The SMCE is $\tilde\theta^\varepsilon_{\rm SMCE} = \arg\min_{\theta} \sum_{k=1}^n \|\tilde F_k(\theta)\|^2$.

Both estimators require solving the deterministic averaged ODE and its linearization. The MCE additionally solves for Poisson correctors and covariance weights.
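As an illustration of the SMCE, the sketch below evaluates $\sum_k \|\tilde F_k(\theta)\|^2$ and minimizes it for a hypothetical scalar model with averaged drift $\bar\lambda^\theta(x) = -\theta x$, for which $\bar X^\theta_t = x_{t_0} e^{-\theta t}$ and $Z^\theta(t,s) = e^{-\theta(t-s)}$ are available in closed form. The model, the function names, and the use of a golden-section search are assumptions made for this example; the source does not specify an optimizer.

```python
import math


def smce_contrast(theta, xs, delta_t):
    """Sum of squared residuals F~_k for the hypothetical drift lam_bar(x) = -theta * x."""
    x0 = xs[0]
    z = math.exp(-theta * delta_t)  # Z^theta(t_k, t_{k-1}) for the linear model
    total = 0.0
    for k in range(1, len(xs)):
        xbar_k = x0 * math.exp(-theta * k * delta_t)
        xbar_km1 = x0 * math.exp(-theta * (k - 1) * delta_t)
        resid = (xs[k] - xbar_k) - z * (xs[k - 1] - xbar_km1)
        total += resid * resid
    return total


def golden_section_min(fun, lo, hi, tol=1e-8):
    """Minimize a unimodal scalar function on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = fun(c), fun(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; reuse c as the new upper interior point.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = fun(c)
        else:
            # Minimum lies in [c, b]; reuse d as the new lower interior point.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = fun(d)
    return 0.5 * (a + b)


def smce_estimate(xs, delta_t, lo=0.01, hi=10.0):
    """SMCE for the hypothetical linear model, searched over theta in [lo, hi]."""
    return golden_section_min(lambda th: smce_contrast(th, xs, delta_t), lo, hi)
```

For this particular linear model the residual simplifies to $x_{t_k} - e^{-\theta\Delta} x_{t_{k-1}}$, so the contrast is unimodal in $\theta$ and golden-section search is adequate; a general model may need a multi-start or gradient-based optimizer.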

4. Asymptotic Theory and Efficiency

The main theoretical results are:

  • Consistency: Both MCE and SMCE are consistent for the true parameter $\theta_0$ as $\epsilon\to 0$ at fixed $n$:

$$\hat\theta^\varepsilon_{\rm MCE}\xrightarrow{P}\theta_0, \qquad \tilde\theta^\varepsilon_{\rm SMCE}\xrightarrow{P}\theta_0.$$

  • Asymptotic normality: Under identifiability and regularity assumptions,

$$\frac{\hat\theta^\varepsilon_{\rm MCE} - \theta_0}{\sqrt\epsilon} \xrightarrow{d} N\bigl(0, M(\theta_0)\bigr), \qquad \frac{\tilde\theta^\varepsilon_{\rm SMCE} - \theta_0}{\sqrt\epsilon} \xrightarrow{d} N\bigl(0, \tilde M(\theta_0)\bigr),$$

with explicit covariance $M(\theta_0)$ for MCE (achieving efficiency in the limit), and a larger $\tilde M(\theta_0)$ for SMCE reflecting its misspecification.

  • High-frequency observations: Consistency is retained as $\Delta=T/n\to 0$ provided $\epsilon = o(\Delta)$ for the SMCE and $\epsilon = o(\Delta^2)$ for the MCE. The limit covariances match the continuous-time Fisher information.

These properties confirm the statistical optimality (in the Cramér–Rao sense) of MCE for the estimation of drift parameters in the multiscale regime, even though only the slow process is observed.
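The asymptotic normality statement translates directly into approximate confidence intervals: for small $\epsilon$, $\hat\theta$ is roughly $N(\theta_0, \epsilon M(\theta_0))$. The sketch below computes such an interval for a scalar parameter. It assumes that a value of $\epsilon$ and a plug-in estimate `m_hat` of the asymptotic variance are available; the function name and arguments are hypothetical.

```python
import math
from statistics import NormalDist


def asymptotic_ci(theta_hat, eps, m_hat, level=0.95):
    """Two-sided interval theta_hat +/- z * sqrt(eps * m_hat) for scalar theta.

    Assumes (theta_hat - theta_0) / sqrt(eps) is approximately N(0, m_hat);
    eps and m_hat are inputs the caller must supply.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # standard normal quantile
    half_width = z * math.sqrt(eps * m_hat)
    return theta_hat - half_width, theta_hat + half_width
```

The same construction applies to the SMCE with $\tilde M$ in place of $M$; because $\tilde M(\theta_0)$ is larger, the resulting interval is wider.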

5. Averaging and Homogenization: Regime Distinctions

The two asymptotic regimes determine the appropriate form of the effective drift, diffusion, and Poisson correctors:

  • In the homogenization regime, a centering condition for $b_\theta$ must be enforced, and effective coefficients derive from solutions to associated Poisson equations for the generator of the fast dynamics.

  • In the averaging regime, effective terms involve averages over the fast variable at finite memory length.

Both regimes lead to the same estimator structure but differ in the computation of these coefficients.
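Both the centering condition and the averaged coefficients are integrals against the invariant measure of the frozen fast dynamics. The sketch below evaluates such integrals by a trapezoid rule, under the assumption that this invariant measure is known and standard Gaussian (as for an Ornstein–Uhlenbeck fast process). The coefficients `b` and `c` are hypothetical examples, and the sketch covers only the plain-average part of the effective drift, not the Poisson-equation corrector terms.

```python
import math


def gaussian_average(h, n_grid=4001, y_max=8.0):
    """Approximate E[h(Y)] for Y ~ N(0, 1) with a composite trapezoid rule."""
    step = 2.0 * y_max / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        y = -y_max + i * step
        weight = 0.5 if i in (0, n_grid - 1) else 1.0  # trapezoid end weights
        total += weight * h(y) * math.exp(-0.5 * y * y)
    return total * step / math.sqrt(2.0 * math.pi)


def b(x, y):
    """Hypothetical fast-oscillating drift; odd in y, hence centered."""
    return x * y


def c(theta, x, y):
    """Hypothetical slow drift; its Gaussian average is -2 * theta * x."""
    return -theta * x * (1.0 + y * y)


def centering_residual(x):
    """Integral of b(x, .) against the assumed invariant measure (should be ~0)."""
    return gaussian_average(lambda y: b(x, y))


def averaged_c(theta, x):
    """Average of c_theta(x, .) against the assumed invariant measure."""
    return gaussian_average(lambda y: c(theta, x, y))
```

A nonzero `centering_residual` would signal that the chosen $b_\theta$ violates the centering condition required in the homogenization regime.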

A critical practical point is that neither estimator requires explicit knowledge of $\epsilon$ or $\delta$: the only inputs are observation times and functional forms of the SDE coefficients. The estimation procedures rely on statistical moment-matching against effective models, sidestepping the need for direct modeling of the fast process.

6. Implementation: Algorithmic Outline and Usage

A canonical workflow for the multiscale inference scheme is:

  1. Preprocessing: Given discrete-time data $\{x_{t_k}\}_{k=0}^n$, set $\Delta=T/n$.

  2. For trial parameter $\theta$:

    • Solve the deterministic averaged ODE $\dot{\bar X}^\theta_s=\bar\lambda^\theta(\bar X^\theta_s)$ with initial condition $x_{t_0}$.
    • Solve the linearized ODE for $Z^\theta(t,s)$.
    • Solve relevant Poisson equations to obtain $\bar J^\theta$ and $\bar q^\theta$.
    • For each $k$, compute $F_k^\varepsilon(\theta)$ and $Q_k(\theta)$ (MCE) or $\tilde F_k(\theta)$ (SMCE).
  3. Contrast minimization: Minimize $U^\varepsilon(\theta)$ (MCE) or $\tilde U(\theta) := \sum_{k=1}^n \|\tilde F_k(\theta)\|^2$ (SMCE) over $\theta$ to obtain the parameter estimate.
  4. Uncertainty quantification: Estimate asymptotic covariance $M(\hat\theta)$ or $\tilde M(\hat\theta)$ for confidence intervals.

This scheme is robust to noise levels and sampling rates provided $\epsilon$ and $\Delta$ satisfy the separation conditions for the chosen estimator. For large $n$, high-frequency observations may require subsampling unless the strong scaling conditions are met.
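Steps 2 and 3 of the workflow can be sketched for the MCE in a scalar setting. The example below assumes a hypothetical model with averaged drift $\bar\lambda^\theta(x) = -\theta x$ and constant $\bar J^\theta$ and $\bar q^\theta$ (arguments `j_bar`, `q_bar`), so that the integrals defining $F_k^\varepsilon$ and $Q_k$ have closed forms; in a real application these quantities come from the Poisson equations. A simple grid search stands in for the optimizer, and all names are illustrative.

```python
import math


def mce_contrast(theta, xs, delta_t, eps=0.0, j_bar=0.0, q_bar=1.0):
    """Contrast U(theta) = sum_k F_k^2 / Q_k for a hypothetical scalar linear model.

    Assumes lam_bar(x) = -theta * x with constant j_bar and q_bar, so that
      integral of Z(t_k, s) ds      = (1 - z) / theta,
      integral of Z(t_k, s)^2 ds    = (1 - z^2) / (2 * theta),
    where z = exp(-theta * delta_t) and theta > 0.
    """
    x0 = xs[0]
    z = math.exp(-theta * delta_t)
    drift_corr = math.sqrt(eps) * j_bar * (1.0 - z) / theta  # sqrt(eps) correction term
    q = q_bar * (1.0 - z * z) / (2.0 * theta)                # covariance increment Q_k
    total = 0.0
    for k in range(1, len(xs)):
        xbar_k = x0 * math.exp(-theta * k * delta_t)
        xbar_km1 = x0 * math.exp(-theta * (k - 1) * delta_t)
        f_k = (xs[k] - xbar_k) - z * (xs[k - 1] - xbar_km1) - drift_corr
        total += f_k * f_k / q
    return total


def mce_estimate(xs, delta_t, grid, **kwargs):
    """Return the grid point minimizing the MCE contrast (grid values must be > 0)."""
    return min(grid, key=lambda th: mce_contrast(th, xs, delta_t, **kwargs))
```

A typical call is `mce_estimate(xs, delta_t, [0.5 + 0.01 * i for i in range(251)])`; refining the grid around the minimizer, or switching to a continuous optimizer, is a natural next step.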

7. Broader Applicability and Regime Guidance

The guiding philosophy of these multiscale inference methods is to exploit the analytic homogenization/averaging limit of the slow-fast system, employing locally Gaussian approximations and moment-based contrast minimization. No direct subsampling or explicit estimation of fast-scale parameters is required. Practical recommendations include:

  • Use MCE for maximal statistical efficiency when computational resources suffice for covariance weight computation.
  • Use SMCE for greater robustness or in contexts where the full covariance structure is computationally prohibitive.
  • Ensure regularity and identifiability conditions (e.g., nondegeneracy of Fisher information) for asymptotic guarantees.

This class of multiscale inference schemes provides a rigorous, computationally feasible, and statistically consistent framework for parameter estimation in multiscale diffusion systems observed at the slow scale, accommodating both homogenization and averaging regimes and enabling efficient uncertainty quantification (Gailus et al., 2017).
