Keywords: aDNA, admixture, archaeogenetics, f-statistics, paleogenomics, qpAdm

Source: DOI:10.1101/2023.11.13.566841

Abstract:
Paleogenomics has expanded our knowledge of human evolutionary history. Since the 2020s, the study of ancient DNA has increased its focus on reconstructing the recent past. However, the historical period is marked by increased demographic complexity and decreased genetic differentiation, and the accuracy of paleogenomic methods in answering questions of historical and archaeological importance under these conditions remains an open question. We used two simulation approaches to evaluate the limitations and behavior of two commonly used methods, qpAdm and the f3-statistic, in admixture inference. The first is based on branch-length data simulated from four simple demographic models of varying complexities and configurations. The second is an analysis of a Eurasian history comprising 59 populations, using whole-genome data modified to reflect ancient DNA conditions such as SNP ascertainment, data missingness, and pseudo-haploidization. We show that under conditions resembling historical populations, qpAdm can identify a small candidate set of true sources and populations closely related to them. However, in typical ancient DNA conditions, qpAdm is unable to further distinguish between them, limiting its utility for resolving fine-scaled hypotheses. Notably, we find that complex gene-flow histories generally lead to improvements in the performance of qpAdm, and we observe no bias in the estimation of admixture weights. We offer a heuristic for admixture inference that incorporates the admixture weight estimates and P-values of qpAdm models, together with f3-statistics, to enhance the power to distinguish between multiple plausible candidates. Finally, we highlight the future potential of qpAdm through whole-genome branch-length f2-statistics, demonstrating the improved demographic inference that could be achieved with advancements in f-statistic estimations.
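As a rough illustration of the f3-statistic named above, the following minimal sketch computes the admixture form f3(C; A, B) as the mean over SNPs of (c - a)(c - b), where a, b, and c are allele frequencies in the two candidate sources and the target. A clearly negative value is the classic signal that C is admixed between populations related to A and B. This is a simplified sketch under stated assumptions, not the authors' pipeline: it omits the finite-sample bias correction and block-jackknife standard errors used in practice, and the function name `f3` and the toy frequencies are hypothetical.

```python
def f3(target, source_a, source_b):
    """Naive f3(target; source_a, source_b) from per-SNP allele frequencies.

    Assumption: inputs are equal-length lists of frequencies in [0, 1].
    No bias correction or standard error is computed here.
    """
    if not (len(target) == len(source_a) == len(source_b)):
        raise ValueError("frequency lists must have equal length")
    if not target:
        raise ValueError("need at least one SNP")
    # Average the product of frequency differences across SNPs.
    total = sum((c - a) * (c - b) for c, a, b in zip(target, source_a, source_b))
    return total / len(target)


# Hypothetical toy data: C is an equal mixture of A and B at every SNP.
freq_a = [0.1, 0.8, 0.3, 0.9]
freq_b = [0.7, 0.2, 0.9, 0.1]
freq_c = [0.5 * a + 0.5 * b for a, b in zip(freq_a, freq_b)]

# Negative for this toy mixture, because c lies between a and b at each SNP.
print(f3(freq_c, freq_a, freq_b))
```

In this toy case the target lies exactly between the two sources at each SNP, so every term in the sum is negative. Real data add genetic drift in the target after admixture, which pushes the statistic upward and can mask the signal.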