Deep interpretability for GWAS (2007.01516v1)
Published 3 Jul 2020 in cs.LG, q-bio.GN, stat.AP, and stat.ML
Abstract: Genome-Wide Association Studies (GWAS) typically use linear models to find genetic variants associated with common diseases. Association testing in these studies is done variant by variant, so it can miss non-linear interaction effects between variants. Deep networks can model these interactions, but they are difficult to train and interpret on large genetic datasets. We propose a method that uses the gradient-based deep interpretability technique DeepLIFT, and show that deep models can identify known genetic risk factors for diabetes along with possibly novel associations.
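To make the idea of per-variant attribution concrete, here is a minimal sketch of DeepLIFT's Rescale rule on a toy network with one hidden ReLU layer. It is not the authors' implementation. The weights, the three-variant genotype vector (coded as 0/1/2 allele counts), and the all-zeros reference input are hypothetical values chosen only for illustration. The sketch assigns each input a contribution to the difference between the network output and its output on the reference.

```python
# Toy sketch of DeepLIFT's Rescale rule on a one-hidden-layer ReLU network.
# All weights, the genotype vector, and the all-zeros reference are made up.

def relu(v):
    return [max(0.0, a) for a in v]

def affine(W, b, x):
    # Row-wise dot product plus bias.
    return [sum(w * xi for w, xi in zip(row, x)) + bj for row, bj in zip(W, b)]

def forward(W1, b1, w2, b2, x):
    z = affine(W1, b1, x)                            # hidden pre-activations
    h = relu(z)                                      # hidden activations
    y = sum(w * hj for w, hj in zip(w2, h)) + b2     # scalar output
    return z, h, y

def deeplift_rescale(W1, b1, w2, b2, x, ref):
    z, h, y = forward(W1, b1, w2, b2, x)
    z0, h0, y0 = forward(W1, b1, w2, b2, ref)

    # ReLU multiplier under the Rescale rule: delta_h / delta_z.
    # When delta_z is (near) zero, fall back to the ordinary gradient.
    m_relu = []
    for zj, z0j, hj, h0j in zip(z, z0, h, h0):
        dz = zj - z0j
        if abs(dz) > 1e-12:
            m_relu.append((hj - h0j) / dz)
        else:
            m_relu.append(1.0 if zj > 0 else 0.0)

    # Chain multipliers through the layers, then scale by the input difference.
    contribs = []
    for i, (xi, ri) in enumerate(zip(x, ref)):
        m = sum(W1[j][i] * m_relu[j] * w2[j] for j in range(len(w2)))
        contribs.append(m * (xi - ri))
    return contribs, y - y0

# Hypothetical network parameters and one hypothetical genotype vector.
W1 = [[0.5, -1.0, 0.0],
      [1.0, 1.0, -0.5]]
b1 = [0.0, -1.0]
w2 = [2.0, -1.0]
b2 = 0.1
x = [2.0, 0.0, 1.0]      # allele counts for three variants (assumed coding)
ref = [0.0, 0.0, 0.0]    # assumed reference input

contribs, delta_y = deeplift_rescale(W1, b1, w2, b2, x, ref)
print(contribs, delta_y)
```

The contributions sum to the output difference between the input and the reference, which is the summation-to-delta property that DeepLIFT is built around. A variant whose genotype matches the reference receives zero contribution here, so the choice of reference matters and would need to be justified for real genotype data.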