Sampling probabilities, diffusions, ancestral graphs, and duality under strong selection (2312.17406v3)
Abstract: Wright-Fisher diffusions and their dual ancestral graphs occupy a central role in the study of allele frequency change and genealogical structure, and they provide expressions, explicit in some special cases but generally implicit, for the sampling probability, a crucial quantity in inference. Under a finite-allele mutation model, with possibly parent-dependent mutation, we consider the asymptotic regime where the selective advantage of one allele grows to infinity, while the other parameters remain fixed. In this regime, we show that the Wright-Fisher diffusion can be approximated either by a Gaussian process or by a process whose components are independent continuous-state branching processes with immigration, aligning with analogous results for Wright-Fisher models but employing different methods. While the first process becomes degenerate at stationarity, the latter does not and provides a simple, analytic approximation for the leading term of the sampling probability. Furthermore, using another approach based on a recursion formula, we characterise all remaining terms to provide a full asymptotic expansion for the sampling probability. Finally, we study the asymptotic behaviour of the rates of the block-counting process of the conditional ancestral selection graph and establish an asymptotic duality relationship between this and the diffusion.
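As a rough numerical illustration of the strong-selection regime described in the abstract, the sketch below considers the simplest case: two alleles with parent-independent mutation. It is not taken from the paper. It assumes the stationary density of the disfavoured allele frequency x is proportional to x^(theta1-1) (1-x)^(theta2-1) exp(-sigma x), which is Wright's formula under one common scaling convention. The names `scaled_mean`, `theta1`, `theta2` and `sigma` are hypothetical. Under that assumption the rescaled frequency y = sigma x has a density approaching Gamma(theta1, 1) as sigma grows, so its mean should approach theta1.

```python
import math

def scaled_mean(sigma, theta1, theta2, n=200_000):
    """Approximate E[sigma * X] under the ASSUMED stationary density
    pi(x) ∝ x**(theta1-1) * (1-x)**(theta2-1) * exp(-sigma*x) on (0, 1).

    After substituting y = sigma * x, the unnormalised density in y is
    y**(theta1-1) * (1 - y/sigma)**(theta2-1) * exp(-y) on (0, sigma).
    Both integrals use a midpoint rule, which never evaluates the endpoints.
    The quadrature is only accurate for theta1 >= 1 (no singularity at 0).
    """
    h = sigma / n
    numerator = 0.0    # integral of y * weight(y)
    denominator = 0.0  # integral of weight(y), the normalising constant
    for i in range(n):
        y = (i + 0.5) * h
        weight = y ** (theta1 - 1.0) * (1.0 - y / sigma) ** (theta2 - 1.0) * math.exp(-y)
        numerator += y * weight
        denominator += weight
    return numerator / denominator

if __name__ == "__main__":
    theta1, theta2 = 2.0, 1.5  # illustrative mutation parameters, not from the paper
    for sigma in (10.0, 100.0, 1000.0):
        m = scaled_mean(sigma, theta1, theta2)
        print(f"sigma={sigma:7.1f}  E[sigma*X]={m:.5f}  Gamma limit mean={theta1}")
```

The gap between the computed mean and theta1 should shrink as sigma increases. This matches the abstract's statement that a non-degenerate limit gives the leading term, with the remaining terms entering as corrections.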