Inferring Admixture Histories of Human Populations Using Linkage Disequilibrium (1211.0251v2)

Published 1 Nov 2012 in q-bio.PE

Abstract: Long-range migrations and the resulting admixtures between populations have been important forces shaping human genetic diversity. Most existing methods for detecting and reconstructing historical admixture events are based on allele frequency divergences or patterns of ancestry segments in chromosomes of admixed individuals. An emerging new approach harnesses the exponential decay of admixture-induced linkage disequilibrium (LD) as a function of genetic distance. Here, we comprehensively develop LD-based inference into a versatile tool for investigating admixture. We present a new weighted LD statistic that can be used to infer mixture proportions as well as dates with fewer constraints on reference populations than previous methods. We define an LD-based three-population test for admixture and identify scenarios in which it can detect admixture events that previous formal tests cannot. We further show that we can uncover phylogenetic relationships among populations by comparing weighted LD curves obtained using a suite of references. Finally, we describe several improvements to the computation and fitting of weighted LD curves that greatly increase the robustness and speed of the calculations. We implement all of these advances in a software package, ALDER, which we validate in simulations and apply to test for admixture among all populations from the Human Genome Diversity Project (HGDP), highlighting insights into the admixture history of Central African Pygmies, Sardinians, and Japanese.

Citations (439)

Summary

  • The paper presents an innovative approach using admixture-induced LD decay to precisely infer both the timing and proportions of historical admixture events.
  • The paper details a fast Fourier transform-based algorithm that significantly enhances computational speed and robustness in analyzing LD patterns.
  • The paper validates its methodology with simulated and real-world data, effectively discerning complex population histories beyond the capabilities of traditional techniques.

Overview of LD-Based Admixture Inference for Human Population Histories

The paper explores an advanced methodology for inferring the admixture history of human populations using linkage disequilibrium (LD). Admixture events, arising from historical migrations and interbreeding between different ancestral populations, leave significant genetic imprints within contemporary populations. This paper proposes novel statistical tools that utilize LD to decode the admixture proportions and dates, offering potential advantages over traditional methods reliant primarily on allele frequencies and ancestry segment analysis.

Methodology and Innovations

The key innovation lies in exploiting admixture-induced LD: the correlations between genetic loci created by admixture and broken down by recombination over generations. The researchers model the decay of these correlations as a function of genetic distance using a weighted LD statistic. Fitting this model allows admixture proportions and dates to be inferred with fewer constraints on reference populations than previous techniques. The paper also introduces an LD-based three-population test for admixture, which is shown to detect admixture events in scenarios where previous formal tests cannot.
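The dating idea can be sketched in a few lines. The example below is a minimal illustration, not ALDER's fitting procedure: it assumes a noise-free curve of the form a(d) = A * exp(-n * d), with d in Morgans and n the number of generations since admixture, and it recovers A and n by ordinary least squares on log a(d). The function names, the distance grid, and the parameter values are hypothetical, and any constant offset or sampling noise in a real curve is ignored.

```python
import math

def simulate_weighted_ld(distances_morgans, amplitude, generations):
    """Noise-free decay curve a(d) = amplitude * exp(-generations * d)."""
    return [amplitude * math.exp(-generations * d) for d in distances_morgans]

def fit_decay(distances_morgans, curve):
    """Least-squares line through (d, log a(d)).

    The slope of the line is -generations and the intercept is log(amplitude).
    Only valid when every curve value is positive (no noise, no offset).
    """
    logs = [math.log(a) for a in curve]
    n = len(distances_morgans)
    mean_d = sum(distances_morgans) / n
    mean_l = sum(logs) / n
    sxx = sum((d - mean_d) ** 2 for d in distances_morgans)
    sxy = sum((d - mean_d) * (l - mean_l)
              for d, l in zip(distances_morgans, logs))
    slope = sxy / sxx
    intercept = mean_l - slope * mean_d
    return math.exp(intercept), -slope

# Hypothetical grid: 0.5 cM to 20 cM in 0.5 cM steps, expressed in Morgans.
distances = [0.005 * k for k in range(1, 41)]
curve = simulate_weighted_ld(distances, amplitude=2e-4, generations=50.0)
est_amplitude, est_generations = fit_decay(distances, curve)
```

On real data the curve is noisy and can cross zero, so a log-linear fit like this one would break down; a nonlinear least-squares fit with an added constant term is the more usual choice in that setting.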

The paper also improves the computation of weighted LD curves, notably through a fast Fourier transform-based algorithm that substantially increases speed and robustness. In addition, improved curve fitting allows better discrimination between true admixture signals and noise from other demographic processes, such as bottlenecks.
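To see why a Fourier transform helps, note that summing products of per-site values over all pairs at a given separation is an autocorrelation. The sketch below is a simplified illustration under stated assumptions, not the algorithm implemented in ALDER: it assumes the per-site values (hypothetically, weight times genotype) have already been placed on an evenly spaced grid, and it compares a direct O(n^2) sum with an O(n log n) FFT-based computation. All function names are hypothetical.

```python
import cmath

def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); length must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def lagged_products_fft(v):
    """Return sum_i v[i] * v[i + k] for each lag k, computed via FFT."""
    n = len(v)
    size = 1
    while size < 2 * n:          # zero-pad so circular wraparound cannot occur
        size *= 2
    padded = [complex(x) for x in v] + [0j] * (size - n)
    spectrum = fft(padded)
    power = [s * s.conjugate() for s in spectrum]
    corr = fft(power, inverse=True)
    return [corr[k].real / size for k in range(n)]  # normalize the inverse

def lagged_products_direct(v):
    """Same quantity by brute force, for comparison."""
    n = len(v)
    return [sum(v[i] * v[i + k] for i in range(n - k)) for k in range(n)]

values = [1.0, 2.0, 3.0, 4.0]
fast = lagged_products_fft(values)
slow = lagged_products_direct(values)
```

Both functions return the same lagged sums; the FFT version scales far better as the number of grid positions grows, which is the kind of speedup the summary attributes to the paper's approach.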

Numerical Analysis and Results

The authors validate these methods on simulated data and on real populations from the Human Genome Diversity Project (HGDP). In simulations of admixture at various dates and proportions, the LD-based method gave precise estimates of admixture dates and mixture proportions, whether the reference populations were closely related to or diverged from the true ancestral populations. Applied to the HGDP, the method distinguishes different population histories, as shown in detailed analyses of Central African Pygmies, Sardinians, and Japanese, and yields insights into complex admixture events that other tests might miss.

In several case studies, the tools identified admixture signals in populations where classical methods have reduced sensitivity, especially when the reference populations have undergone extensive genetic drift or when admixture fractions are very small.

Practical and Theoretical Implications

This paper significantly extends the toolkit available to geneticists for studying human population history. The ease of use with one or two references, combined with its comprehensive mathematical development, positions this approach as both a complementary and alternative method to existing admixture analysis tools. Notably, its robustness across different scenarios widens the scope for research that relies on fewer assumptions regarding the availability and suitability of ancestral references.

The novel approach underscores the value of harnessing LD not only for the dating of admixture events but also for quantitatively inferring mixture proportions and elucidating complex population phylogenies. These advancements have implications in anthropology, historical linguistics, and archaeology, allowing for a better understanding of the movements and interactions of ancient human societies.

Future Directions

Future work could generalize this framework to multi-way and continuous admixture, addressing scenarios where admixture does not occur as a single isolated pulse. Moreover, integrating LD-based insights with drift-based phylogenetic models could improve the accuracy and range of historical inferences in large-scale genomic studies. This research thus lays the groundwork for a broadly applicable toolkit for untangling human ancestry.

This paper represents a significant methodological development in population genetics, offering nuanced insights into human admixture histories through the analysis of linkage disequilibrium. By combining statistical modeling with efficient computation, the authors provide the research community with a powerful tool for exploring human genetic diversity and its historical underpinnings.