
An Association Test Based on Kernel-Based Neural Networks for Complex Genetic Association Analysis (2312.06669v1)

Published 6 Dec 2023 in q-bio.QM, cs.LG, and stat.ME

Abstract: The advent of artificial intelligence, especially the progress of deep neural networks, is expected to revolutionize genetic research and offer unprecedented potential to decode the complex relationships between genetic variants and disease phenotypes, which could mark a significant step toward improving our understanding of disease etiology. While deep neural networks hold great promise for genetic association analysis, limited research has focused on developing neural-network-based tests to dissect complex genotype-phenotype associations. This difficulty arises from the opaque nature of neural networks and the absence of defined limiting distributions. We have previously developed a kernel-based neural network model (KNN) that combines the strengths of linear mixed models with those of conventional neural networks. KNN adopts a computationally efficient minimum norm quadratic unbiased estimator (MINQUE) algorithm and uses the KNN structure to capture the complex relationship between large-scale sequencing data and a disease phenotype of interest. In the KNN framework, we introduce a MINQUE-based test to assess the joint association of genetic variants with the phenotype. The test accounts for non-linear and non-additive effects, and its test statistic follows a mixture of chi-square distributions. We also construct two additional tests to evaluate and interpret linear and non-linear/non-additive genetic effects, including interaction effects. Our simulations show that our method consistently controls the type I error rate under various conditions and achieves greater power than the commonly used sequence kernel association test (SKAT), especially when non-linear and interaction effects are involved. When applied to real data from the UK Biobank, our approach identified genes associated with hippocampal volume, which can be further replicated and evaluated for their role in the pathogenesis of Alzheimer's disease.
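
The abstract states that the MINQUE-based test follows a mixture of chi-square distributions. As a rough illustration of what that implies for computing a p-value, the sketch below estimates the tail probability of a weighted sum of independent one-degree-of-freedom chi-square variables by plain Monte Carlo simulation. This is not the authors' procedure: the function name `mixture_chi2_pvalue`, the weights, and the observed statistic are all hypothetical, and in practice the weights would come from the eigenvalues of the matrix defining the quadratic form, with an exact or moment-matching method typically replacing the simulation.

```python
import random


def mixture_chi2_pvalue(q_obs, weights, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(sum_i w_i * Z_i^2 >= q_obs), Z_i iid N(0, 1).

    Hypothetical helper for illustration only; `weights` are assumed to be
    the non-negative mixture weights of the chi-square mixture.
    """
    rng = random.Random(seed)  # fixed seed so repeated calls agree
    exceed = 0
    for _ in range(n_draws):
        # One draw from the mixture: weighted sum of squared standard normals.
        draw = sum(w * rng.gauss(0.0, 1.0) ** 2 for w in weights)
        if draw >= q_obs:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_draws + 1)


# Hypothetical weights and observed statistic, chosen only for demonstration.
example_weights = [0.5, 0.3, 0.2]
example_p = mixture_chi2_pvalue(2.5, example_weights)
print(f"estimated p-value: {example_p:.4f}")
```

Simulation keeps the sketch dependency-free, but its resolution is limited to roughly 1/n_draws, so it is unsuitable for the very small p-values needed in genome-wide testing.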

