Statistical Spatially Inhomogeneous Diffusion Inference (2312.05793v1)
Abstract: Inferring a diffusion equation from discretely-observed measurements is a statistical challenge of significant importance in a variety of fields, from single-molecule tracking in biophysical systems to modeling financial instruments. Assuming that the underlying dynamical process obeys a $d$-dimensional stochastic differential equation of the form $$\mathrm{d}\boldsymbol{x}_t=\boldsymbol{b}(\boldsymbol{x}_t)\mathrm{d} t+\Sigma(\boldsymbol{x}_t)\mathrm{d}\boldsymbol{w}_t,$$ we propose neural network-based estimators of both the drift $\boldsymbol{b}$ and the spatially-inhomogeneous diffusion tensor $D = \Sigma\Sigma^{T}$ and provide statistical convergence guarantees when $\boldsymbol{b}$ and $D$ are $s$-H\"older continuous. Notably, our bound aligns with the minimax optimal rate $N^{-\frac{2s}{2s+d}}$ for nonparametric function estimation even in the presence of correlation within observational data, which necessitates careful handling when establishing fast-rate generalization bounds. Our theoretical results are bolstered by numerical experiments demonstrating accurate inference of spatially-inhomogeneous diffusion tensors.
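The quantities the paper estimates can be illustrated with a minimal sketch. This is not the paper's neural-network estimator: it is a simple one-dimensional binned conditional-moment (Kramers–Moyal) estimate, with hypothetical coefficients $b(x) = -x$ and $\Sigma(x) = \sqrt{1 + 0.5\,x^2}$ chosen for illustration, showing the identities $b(x) \approx \mathbb{E}[\Delta x \mid x]/\Delta t$ and $D(x) \approx \mathbb{E}[(\Delta x)^2 \mid x]/\Delta t$ that any estimator of the drift and the spatially-inhomogeneous diffusion targets.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground-truth coefficients (not from the paper):
b = lambda x: -x                               # drift pulls toward 0
sigma = lambda x: np.sqrt(1.0 + 0.5 * x**2)    # state-dependent diffusion

# Euler-Maruyama simulation of dx = b(x) dt + sigma(x) dw
dt, N = 1e-3, 500_000
x = np.empty(N)
x[0] = 0.0
noise = rng.standard_normal(N - 1)
for i in range(N - 1):
    x[i + 1] = x[i] + b(x[i]) * dt + sigma(x[i]) * np.sqrt(dt) * noise[i]

# Binned conditional-moment estimators of b(x) and D(x) = sigma(x)^2
dx = np.diff(x)
bins = np.linspace(-2.0, 2.0, 21)
centers = 0.5 * (bins[:-1] + bins[1:])
idx = np.digitize(x[:-1], bins) - 1
b_hat = np.full(len(centers), np.nan)
D_hat = np.full(len(centers), np.nan)
for k in range(len(centers)):
    m = idx == k
    if m.sum() > 100:                          # skip sparsely visited bins
        b_hat[k] = dx[m].mean() / dt           # first conditional moment
        D_hat[k] = (dx[m] ** 2).mean() / dt    # second conditional moment
```

The paper replaces the piecewise-constant (binned) function class here with neural networks, which is what makes the $N^{-\frac{2s}{2s+d}}$ rate attainable for $s$-Hölder coefficients in higher dimension; the local-moment identities being fit are the same.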