A Kernel-Based Neural Network Test for High-dimensional Sequencing Data Analysis (2312.02850v2)
Abstract: Recent advances in AI, especially in deep neural network (DNN) technology, have revolutionized many fields. Although DNNs play a central role in modern AI, they have rarely been used in sequencing data analysis because of the challenges posed by high-dimensional sequencing data (e.g., overfitting). Moreover, due to the complexity of neural networks and their unknown limiting distributions, building association tests on neural networks for genetic association analysis remains a great challenge. To address these challenges and fill the important gap in the use of AI for high-dimensional sequencing data analysis, we introduce a new kernel-based neural network (KNN) test for complex association analysis of sequencing data. The test is built on our previously developed KNN framework, which uses random effects to model the overall effects of high-dimensional genetic data and adopts kernel-based neural network structures to model complex genotype-phenotype relationships. Based on KNN, a Wald-type test is then introduced to evaluate the joint association of high-dimensional genetic data with a disease phenotype of interest, considering non-linear and non-additive effects (e.g., interaction effects). Through simulations, we demonstrate that the proposed method attains higher power than the sequence kernel association test (SKAT), especially in the presence of non-linear and interaction effects. Finally, we apply the method to the whole genome sequencing (WGS) dataset from the Alzheimer's Disease Neuroimaging Initiative (ADNI) study to investigate new genes associated with hippocampal volume change over time.