Keywords: f-statistics; aDNA; admixture; ancient DNA; archaeogenetics; paleogenomics; qpAdm

MeSH: DNA, Ancient / analysis; Humans; Models, Genetic; Genetics, Population / methods; Gene Flow; Polymorphism, Single Nucleotide; Genome, Human; Evolution, Molecular

Source: DOI:10.1093/genetics/iyae110

Abstract:
Our knowledge of human evolutionary history has been greatly advanced by paleogenomics. Since the 2020s, the study of ancient DNA has increasingly focused on reconstructing the recent past. However, the accuracy of paleogenomic methods in resolving questions of historical and archaeological importance amidst the increased demographic complexity and decreased genetic differentiation remains an open question. We evaluated the performance and behavior of two commonly used methods, qpAdm and the f3-statistic, for admixture inference under a diversity of demographic models and data conditions. We used two complementary simulation approaches. First, we explored a wide demographic parameter space under four simple demographic models of varying complexity and configuration, using branch-length data from two chromosomes. Second, we analyzed a model of Eurasian history composed of 59 populations, using whole-genome data modified to reflect ancient DNA conditions such as SNP ascertainment, data missingness, and pseudohaploidization. We observe that population differentiation is the primary factor driving qpAdm performance. Notably, while complex gene flow histories influence which models are classified as plausible, they do not reduce overall performance. Under conditions reflective of the historical period, qpAdm most frequently identifies the true model as plausible among a small candidate set of closely related populations. To increase qpAdm's utility for resolving fine-scale hypotheses, we provide a heuristic for further distinguishing between candidate models that incorporates qpAdm model P-values and f3-statistics. Finally, we demonstrate a significant performance increase for qpAdm using whole-genome branch-length f2-statistics, highlighting the potential for improved demographic inference that could be achieved with future advancements in f-statistic estimation.