
Identifying Drift, Diffusion, and Causal Structure from Temporal Snapshots (2410.22729v2)

Published 30 Oct 2024 in stat.ML, cs.LG, math.ST, and stat.TH

Abstract: Stochastic differential equations (SDEs) are a fundamental tool for modelling dynamic processes, including gene regulatory networks (GRNs), contaminant transport, financial markets, and image generation. However, learning the underlying SDE from data is a challenging task, especially if individual trajectories are not observable. Motivated by burgeoning research in single-cell datasets, we present the first comprehensive approach for jointly identifying the drift and diffusion of an SDE from its temporal marginals. Assuming linear drift and additive diffusion, we prove that these parameters are identifiable from marginals if and only if the initial distribution lacks any generalized rotational symmetries. We further prove that the causal graph of any SDE with additive diffusion can be recovered from the SDE parameters. To complement this theory, we adapt entropy-regularized optimal transport to handle anisotropic diffusion, and introduce APPEX (Alternating Projection Parameter Estimation from $X_0$), an iterative algorithm designed to estimate the drift, diffusion, and causal graph of an additive noise SDE, solely from temporal marginals. We show that APPEX iteratively decreases Kullback-Leibler divergence to the true solution, and demonstrate its effectiveness on simulated data from linear additive noise SDEs.
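To make the identifiability statement concrete, the following is a minimal sketch and is not taken from the paper. It assumes a two-dimensional linear additive noise SDE $dX_t = A X_t\,dt + G\,dW_t$ with a Gaussian initial distribution, so that every marginal is Gaussian and fully described by its mean and covariance. The function name `gaussian_marginals`, the step size, and the specific matrices are illustrative choices. The sketch propagates the moment equations $\dot m = A m$ and $\dot\Sigma = A\Sigma + \Sigma A^\top + GG^\top$ with forward Euler steps and compares a zero drift against a pure rotation drift.

```python
import numpy as np

def gaussian_marginals(A, G, mean0, cov0, dt=1e-3, n_steps=1000):
    """Propagate the mean and covariance of dX = A X dt + G dW.

    Uses forward Euler on the moment ODEs (an illustrative choice):
        d(mean)/dt = A mean
        d(cov)/dt  = A cov + cov A^T + G G^T
    """
    H = G @ G.T
    mean, cov = mean0.copy(), cov0.copy()
    for _ in range(n_steps):
        mean = mean + dt * (A @ mean)
        cov = cov + dt * (A @ cov + cov @ A.T + H)
    return mean, cov

G = np.eye(2)                                 # isotropic additive noise
mean0 = np.zeros(2)
A_zero = np.zeros((2, 2))                     # no drift
A_rot = np.array([[0.0, 1.0], [-1.0, 0.0]])   # skew-symmetric drift (pure rotation)

# Case 1: rotationally symmetric initial distribution N(0, I).
iso = np.eye(2)
_, cov_a = gaussian_marginals(A_zero, G, mean0, iso)
_, cov_b = gaussian_marginals(A_rot, G, mean0, iso)
same_marginals_iso = np.allclose(cov_a, cov_b)

# Case 2: anisotropic initial distribution N(0, diag(1, 4)).
aniso = np.diag([1.0, 4.0])
_, cov_c = gaussian_marginals(A_zero, G, mean0, aniso)
_, cov_d = gaussian_marginals(A_rot, G, mean0, aniso)
same_marginals_aniso = np.allclose(cov_c, cov_d)

print("isotropic X0, marginals coincide:  ", same_marginals_iso)
print("anisotropic X0, marginals coincide:", same_marginals_aniso)
```

With the isotropic start, the rotation leaves every marginal unchanged, so the two drifts cannot be told apart from snapshots alone. With the anisotropic start, this particular pair of drifts produces different covariances. This only illustrates the role of symmetry in the initial distribution; it does not reproduce the paper's general condition on generalized rotational symmetries or the APPEX algorithm.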

