Non-identifiability of identity coefficients at biallelic loci (1310.3518v1)
Abstract: Shared genealogies introduce allele dependencies in diploid genotypes, as alleles within an individual or between different individuals will likely match when they originate from a recent common ancestor. At a locus shared by a pair of diploid individuals, there are nine combinatorially distinct modes of identity-by-descent (IBD), capturing all possible combinations of coancestry and inbreeding. A distribution over the IBD modes is described by the nine associated probabilities, known as (Jacquard's) identity coefficients. The genetic relatedness between two individuals can be succinctly characterized by the identity coefficients corresponding to the joint genealogy. The identity coefficients (together with allele frequencies) determine the distribution of joint genotypes at a locus. At a locus with two possible alleles, identity coefficients are not identifiable because different coefficients can generate the same genotype distribution. We analyze precisely how different IBD modes combine into identical genotype distributions at diallelic loci. In particular, we describe IBD mode mixtures that result in identical genotype distributions at all allele frequencies, implying the non-identifiability of the identity coefficients from independent loci. Our analysis yields an exhaustive characterization of relatedness statistics that are always identifiable. Importantly, we show that identifiable relatedness statistics include the kinship coefficient (probability that a random pair of alleles are identical by descent between individuals) and inbreeding-related measures, which can thus be estimated from genotype distributions at independent loci.
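As an illustration of the identifiable statistics named in the abstract, the following minimal sketch computes the kinship coefficient and the two inbreeding coefficients from a vector of nine identity coefficients. It assumes Jacquard's conventional ordering of the condensed identity states (Δ1 through Δ9, with Δ7 = both allele pairs shared and Δ9 = no IBD), which the abstract does not spell out; the function names `kinship` and `inbreeding` are hypothetical and not taken from the paper.

```python
from fractions import Fraction


def _check(delta):
    # The nine identity coefficients form a probability distribution.
    if len(delta) != 9:
        raise ValueError("expected nine identity coefficients")
    if any(d < 0 for d in delta) or abs(sum(delta) - 1) > 1e-9:
        raise ValueError("coefficients must be non-negative and sum to 1")


def kinship(delta):
    """Probability that a random allele from A and a random allele from B are IBD.

    Assumes Jacquard's conventional ordering Delta_1..Delta_9.
    """
    _check(delta)
    d1, d2, d3, d4, d5, d6, d7, d8, d9 = delta
    return d1 + Fraction(1, 2) * (d3 + d5 + d7) + Fraction(1, 4) * d8


def inbreeding(delta):
    """Inbreeding coefficients (f_A, f_B): each individual's two alleles are IBD."""
    _check(delta)
    d1, d2, d3, d4, d5, d6, d7, d8, d9 = delta
    return d1 + d2 + d3 + d4, d1 + d2 + d5 + d6


# Full siblings of unrelated, non-inbred parents: Delta_7 = 1/4, Delta_8 = 1/2, Delta_9 = 1/4.
full_sibs = [Fraction(0)] * 6 + [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
theta = kinship(full_sibs)
f_a, f_b = inbreeding(full_sibs)
```

For the full-sibling vector, `theta` is the familiar kinship of 1/4 and both inbreeding coefficients are zero. Exact fractions are used here only to keep the arithmetic transparent; floating-point inputs work the same way.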