Multiscale Inference for Diffusion Models
- The paper introduces a multiscale inference scheme that uses contrast minimization to estimate drift parameters from slow-scale observations in systems with fast–slow dynamics.
- It employs a stochastic Taylor expansion to approximate the behavior of the slow process and derive effective limits in both homogenization and averaging regimes.
- It establishes consistency and asymptotic normality for the estimators while addressing computational challenges with simplified covariance structures.
A multiscale inference scheme for diffusion models provides a principled methodology for inferring unknown parameters or latent processes in systems governed by stochastic dynamics with multiple well-separated time scales. These schemes address the challenges posed by disparate dynamical regimes—often encountered in physical, biological, and engineered systems—where a slow “coarse” process is coupled with rapidly fluctuating “fast” variables. Observational data typically only capture the slow component, necessitating statistically efficient and robust inference strategies that account for the multiscale structure and its homogenization or averaging limits.
1. Multiscale Diffusion Models: Fast–Slow Dynamics
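Consider a multiscale stochastic system on a fixed time horizon, consisting of a slow variable $X^\varepsilon$ coupled to a fast variable $Y^\varepsilon$ and driven by independent Wiener processes $W$ and $B$. Here $\varepsilon$ is the noise scale and $\delta = \delta(\varepsilon)$ is a small parameter controlling scale separation, with $\delta \to 0$ as $\varepsilon \to 0$. The target is to estimate unknown drift parameters $\theta$ from observations of the slow variable $X^\varepsilon$ alone.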
Two asymptotic regimes are central:
- Homogenization regime: The fast process is sufficiently rapid to justify a homogenized effective equation for the slow process, under a centering condition imposed on a drift coefficient with respect to the invariant measure of the frozen fast dynamics.
- Averaging regime: The effective coefficients involve averages over the fast variable with finite memory.
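To make the fast-slow structure concrete, the following is a minimal Euler-Maruyama sketch of a toy system. The coefficients (a linear slow drift `-theta * X + Y` and an Ornstein-Uhlenbeck fast variable), the function name `simulate_fast_slow`, and all default values are illustrative assumptions, not the model analyzed by Gailus et al.

```python
import numpy as np


def simulate_fast_slow(theta, eps, delta, x0=1.0, y0=0.0, T=1.0, dt=1e-4, seed=0):
    """Euler-Maruyama path of a hypothetical toy fast-slow system.

    Assumed model (for illustration only):
        dX = (-theta * X + Y) dt + sqrt(eps) dW        # slow variable
        dY = -(Y / delta) dt + (1 / sqrt(delta)) dB    # fast OU variable
    The step dt should be well below delta for the explicit scheme to be stable.
    """
    rng = np.random.default_rng(seed)
    n_steps = int(round(T / dt))
    t = np.linspace(0.0, T, n_steps + 1)
    x = np.empty(n_steps + 1)
    y = np.empty(n_steps + 1)
    x[0], y[0] = x0, y0
    # Independent Brownian increments for the slow and fast equations.
    dW = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    dB = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    for k in range(n_steps):
        x[k + 1] = x[k] + (-theta * x[k] + y[k]) * dt + np.sqrt(eps) * dW[k]
        y[k + 1] = y[k] - (y[k] / delta) * dt + dB[k] / np.sqrt(delta)
    return t, x, y


if __name__ == "__main__":
    t, x, y = simulate_fast_slow(theta=2.0, eps=1e-3, delta=1e-3)
    # Compare the slow path with the averaged path x0 * exp(-theta * t).
    print("max deviation from averaged path:", np.max(np.abs(x - np.exp(-2.0 * t))))
```

In this toy model the fast variable has zero mean under its invariant law, so the averaged drift of the slow variable is simply `-theta * x`; only the slow path `x` would be available to an estimator.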
2. Stochastic Taylor Expansion and Effective Limit
The stochastic behavior of the slow process $X^\varepsilon$ is approximated by an expansion around a deterministic path $\bar{X}$, which solves the averaged ODE. The drift of this ODE is given by an average of explicit functions of the coefficients over the fast variable (different for homogenization and averaging).
A second-order pathwise stochastic Taylor expansion is derived. Its terms involve the fundamental solution of the linearized system and solutions of Poisson equations associated with the fast process. This expansion provides the statistical basis for an approximate transition density of the slow process and motivates the ensuing estimation schemes.
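As a rough illustration of the two deterministic ingredients named above, the sketch below integrates an averaged ODE together with its linearization (the variational equation whose solution is the fundamental solution). The logistic drift `theta * x * (1 - x)` and the function names are hypothetical choices made only so that the result can be checked against a closed form; the Poisson correctors are not covered here.

```python
import numpy as np
from scipy.integrate import solve_ivp


def averaged_drift(x, theta):
    # Hypothetical averaged drift (logistic), chosen for its closed-form flow.
    return theta * x * (1.0 - x)


def averaged_drift_dx(x, theta):
    # Derivative of the averaged drift with respect to the state x.
    return theta * (1.0 - 2.0 * x)


def solve_averaged_and_linearized(theta, x0, t_eval):
    """Jointly integrate xbar' = c(xbar) and phi' = c_x(xbar) * phi, phi(t_eval[0]) = 1."""

    def rhs(t, state):
        xbar, phi = state
        return [averaged_drift(xbar, theta), averaged_drift_dx(xbar, theta) * phi]

    sol = solve_ivp(rhs, (t_eval[0], t_eval[-1]), [x0, 1.0],
                    t_eval=t_eval, rtol=1e-9, atol=1e-12)
    return sol.y[0], sol.y[1]


if __name__ == "__main__":
    times = np.linspace(0.0, 1.0, 11)
    xbar, phi = solve_averaged_and_linearized(theta=1.5, x0=0.2, t_eval=times)
    print("xbar(T) =", xbar[-1], " phi(T) =", phi[-1])
```

For a scalar state the fundamental solution is the sensitivity of the averaged path to its initial condition, which is what the second component tracks.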
3. Minimum Contrast and Simplified Estimators
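Utilizing the expansion, the inference strategy is to define an explicit Gaussian “misspecified model” for the increments of the slow process between discrete observation times $t_0 < t_1 < \dots < t_n$. For any trial parameter $\theta$:
- Minimum Contrast Estimator (MCE):
  - For each observation interval, compute the model-predicted increment of the slow process, including its drift-correction term, and the corresponding covariance increment.
  - The contrast functional then combines the residuals between observed and predicted increments over all observation intervals, weighted using the covariance increments.
  - The MCE is the value of $\theta$ that minimizes this contrast.
- Simplified MCE (SMCE):
  - Omits covariance weighting and the drift-correction term, leaving an unweighted contrast built from the same residuals.
  - The SMCE is the value of $\theta$ that minimizes the simplified contrast.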
Both estimators require solving the deterministic averaged ODE and its linearization. The MCE additionally solves for Poisson correctors and covariance weights.
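A minimal sketch of a simplified, unweighted contrast follows, assuming a hypothetical scalar averaged drift `-theta * x` whose flow is known in closed form. The residual definition (observed increments minus increments of the averaged path started at the first observation), the function names, and the search bounds are assumptions for illustration; the covariance weights and drift correction of the full MCE are deliberately left out.

```python
import numpy as np
from scipy.optimize import minimize_scalar


def averaged_path(theta, x0, times):
    # Closed-form solution of the assumed averaged ODE xbar' = -theta * xbar.
    return x0 * np.exp(-theta * (times - times[0]))


def simplified_contrast(theta, times, obs):
    # Unweighted sum of squared residuals between observed and model increments.
    model_increments = np.diff(averaged_path(theta, obs[0], times))
    return float(np.sum((np.diff(obs) - model_increments) ** 2))


def smce(times, obs, bounds=(0.01, 10.0)):
    # One-dimensional bounded minimization of the simplified contrast.
    result = minimize_scalar(simplified_contrast, bounds=bounds,
                             args=(times, obs), method="bounded")
    return result.x


if __name__ == "__main__":
    times = np.linspace(0.0, 1.0, 11)
    obs = averaged_path(2.0, 1.0, times)  # noise-free synthetic observations
    print("estimated theta:", smce(times, obs))
```

With noise-free data generated from the assumed averaged path, the contrast vanishes at the true parameter, which gives a quick sanity check before moving to noisy observations.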
4. Asymptotic Theory and Efficiency
The main theoretical results are:
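- Consistency: Both MCE and SMCE are consistent for the true parameter $\theta_0$ as $\varepsilon \to 0$ at a fixed number of observations $n$.
- Asymptotic normality: Under identifiability and regularity assumptions, the suitably rescaled estimation error converges in distribution to a centered Gaussian, with an explicit covariance for MCE (achieving efficiency in the limit) and a larger covariance for SMCE reflecting its misspecification.
- High-frequency observations: For both estimators, consistency is retained as the number of observations $n$ grows, provided $n$ grows slowly enough relative to the separation of scales; the growth condition differs between SMCE and MCE. The limit covariances match the continuous-time Fisher information.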
These properties confirm the statistical optimality (in the Cramér–Rao sense) of MCE for the estimation of drift parameters in the multiscale regime, even though only the slow process is observed.
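The small-noise consistency statement can be probed numerically with a toy check, sketched below under strong simplifying assumptions: the data come from a scalar Ornstein-Uhlenbeck equation `dX = -theta * X dt + sqrt(eps) dW` (an averaged limit with the fast variable ignored), sampled exactly at a fixed set of observation times, and the estimator is an unweighted increment contrast. All names and values are hypothetical, and the experiment illustrates the trend rather than verifying the theorems.

```python
import numpy as np
from scipy.optimize import minimize_scalar


def sample_small_noise_ou(theta, eps, times, x0, xi):
    """Exact samples of dX = -theta * X dt + sqrt(eps) dW at the given times."""
    x = np.empty(len(times))
    x[0] = x0
    for k in range(1, len(times)):
        h = times[k] - times[k - 1]
        decay = np.exp(-theta * h)
        std = np.sqrt(eps * (1.0 - decay**2) / (2.0 * theta))
        x[k] = x[k - 1] * decay + std * xi[k - 1]
    return x


def smce(times, obs):
    # Unweighted increment contrast against the averaged path x0 * exp(-theta * t).
    def contrast(theta):
        model = obs[0] * np.exp(-theta * (times - times[0]))
        return float(np.sum((np.diff(obs) - np.diff(model)) ** 2))

    return minimize_scalar(contrast, bounds=(0.01, 10.0), method="bounded").x


def estimation_errors(theta0=2.0, eps_values=(1e-1, 1e-2, 1e-3, 1e-4), n_obs=10, seed=1):
    rng = np.random.default_rng(seed)
    times = np.linspace(0.0, 1.0, n_obs + 1)
    xi = rng.standard_normal(n_obs)  # the same noise draw is reused for every eps
    errors = {}
    for eps in eps_values:
        obs = sample_small_noise_ou(theta0, eps, times, 1.0, xi)
        errors[eps] = abs(smce(times, obs) - theta0)
    return errors


if __name__ == "__main__":
    for eps, err in estimation_errors().items():
        print(f"eps = {eps:g}: |theta_hat - theta0| = {err:.4f}")
```

Reusing one noise draw across noise levels isolates the effect of `eps` with the number of observations held fixed, mirroring the fixed-$n$ setting of the consistency result.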
5. Averaging and Homogenization: Regime Distinctions
The two asymptotic regimes determine the appropriate form of the effective drift, diffusion, and Poisson correctors:
- In the homogenization regime, a centering condition must be enforced, and effective coefficients derive from solutions to associated Poisson equations for the generator of the fast dynamics.
- In the averaging regime, effective terms involve averages over the fast variable at finite memory length.
Both regimes lead to the same estimator structure but differ in the computation of these coefficients.
A critical practical point is that neither estimator requires explicit knowledge of the noise scale $\varepsilon$ or the scale-separation parameter $\delta$: the only inputs are the observation times and the functional forms of the SDE coefficients. The estimation procedures rely on statistical moment-matching against effective models, sidestepping the need for direct modeling of the fast process.
6. Implementation: Algorithmic Outline and Usage
A canonical workflow for the multiscale inference scheme is:
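- Preprocessing: Collect the discrete-time observations of the slow process at times $t_0 < t_1 < \dots < t_n$.
- For each trial parameter $\theta$:
  - Solve the deterministic averaged ODE from its initial condition.
  - Solve the linearized ODE for the fundamental solution.
  - Solve the relevant Poisson equations to obtain the correctors.
  - For each observation interval, compute the predicted increment and covariance increment (MCE) or the predicted increment alone (SMCE).
- Contrast minimization: Minimize the MCE or SMCE contrast over $\theta$ to obtain the parameter estimate.
- Uncertainty quantification: Estimate the asymptotic covariance of the chosen estimator for confidence intervals.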
This scheme is robust to noise levels and sampling rates provided $\varepsilon$ and $\delta$ satisfy the separation conditions for the chosen estimator. For a large number of observations $n$, high-frequency data may require subsampling unless the strong scaling conditions are met.
7. Broader Applicability and Regime Guidance
The guiding philosophy of these multiscale inference methods is to exploit the analytic homogenization/averaging limit of the slow-fast system, employing locally Gaussian approximations and moment-based contrast minimization. No direct subsampling or explicit estimation of fast-scale parameters is required. Practical recommendations include:
- Use MCE for maximal statistical efficiency when computational resources suffice for covariance weight computation.
- Use SMCE for greater robustness or in contexts where the full covariance structure is computationally prohibitive.
- Ensure regularity and identifiability conditions (e.g., nondegeneracy of Fisher information) for asymptotic guarantees.
This class of multiscale inference schemes provides a rigorous, computationally feasible, and statistically consistent framework for parameter estimation in multiscale diffusion systems observed at the slow scale, accommodating both homogenization and averaging regimes and enabling efficient uncertainty quantification (Gailus et al., 2017).