Keywords: ancient genomics; aurochs; cattle domestication; imputation

MeSH: Animals; Cattle / genetics; Polymorphism, Single Nucleotide; Genome; DNA, Ancient / analysis; Haplotypes; Genotype; Genomics / methods

Source: DOI:10.1093/molbev/msae076

Abstract:
Ancient genomic analyses are often restricted to pseudohaploid data because of low genome coverage. Leveraging low-coverage data by imputation is therefore highly desirable: it yields phased diploid genotypes that enable haplotype-based interrogation and single nucleotide polymorphism (SNP) calling at unsequenced positions. This approach has not been investigated for ancient cattle genomes, although they are compelling subjects for archeological, evolutionary, and economic reasons. Here, we test it by sequencing a Mesolithic European aurochs (18.49×; 9,852 to 9,376 calBCE) and an Early Medieval European cow (18.69×; 427 to 580 calCE) and combining these with published individuals: two ancient and three modern. We downsample these genomes (to 0.25×, 0.5×, 1.0×, and 2.0×) and impute diploid genotypes, using a reference panel of 171 published modern cattle genomes that we curated for 21.7 million (Mn) phased SNPs. We recover high densities of correct calls, with an accuracy of >99.1% at variant sites for the lowest downsample depth of 0.25×, increasing to >99.5% for 2.0× (transversions only, minor allele frequency [MAF] ≥ 2.5%). The recovery of SNPs correlates with coverage: on average, 58% of sites are recovered at 0.25×, increasing to 87% at 2.0×, utilizing an average of 3.5 Mn transversions (MAF ≥ 2.5%), even in the aurochs, despite its having the greatest temporal distance from the modern reference panel. Our imputed genomes behave similarly to directly called data in allele frequency-based analyses, for example in consistently identifying runs of homozygosity >2 Mb, including a long homozygous region in the Mesolithic European aurochs.
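The evaluation described in the abstract (restrict to transversions with MAF ≥ 2.5%, then measure how many sites are recovered and how many recovered calls agree with the high-coverage genotypes) can be illustrated with a minimal sketch. It is not the authors' pipeline. The `Site` record, the `evaluate` function, and the toy data are hypothetical. "Recovered" is assumed here to mean that imputation produced a call at the site, and "accuracy" is assumed to be the fraction of recovered calls matching the high-coverage genotype; the paper's exact definitions may differ.

```python
from dataclasses import dataclass
from typing import Optional

# Transitions are A<->G and C<->T; every other substitution is a transversion.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


@dataclass
class Site:
    """Hypothetical per-site record (not a format used in the paper)."""
    ref: str
    alt: str
    maf: float               # minor allele frequency in the reference panel
    truth: int               # alt-allele count (0, 1, 2) from high-coverage calls
    imputed: Optional[int]   # alt-allele count after imputation; None if no call


def is_transversion(ref: str, alt: str) -> bool:
    ref, alt = ref.upper(), alt.upper()
    return ref != alt and frozenset((ref, alt)) not in TRANSITIONS


def evaluate(sites, min_maf=0.025):
    """Return (recovery, accuracy) over transversions with MAF >= min_maf."""
    kept = [s for s in sites if is_transversion(s.ref, s.alt) and s.maf >= min_maf]
    called = [s for s in kept if s.imputed is not None]
    correct = sum(1 for s in called if s.imputed == s.truth)
    recovery = len(called) / len(kept) if kept else float("nan")
    accuracy = correct / len(called) if called else float("nan")
    return recovery, accuracy


# Toy data, invented purely for illustration.
example = [
    Site("A", "C", 0.10, 1, 1),     # transversion, recovered, correct
    Site("A", "G", 0.30, 2, 2),     # transition: excluded
    Site("G", "T", 0.01, 0, 0),     # MAF below threshold: excluded
    Site("C", "G", 0.05, 2, 1),     # transversion, recovered, incorrect
    Site("T", "A", 0.20, 0, None),  # transversion, not recovered
    Site("C", "A", 0.40, 1, 1),     # transversion, recovered, correct
]
recovery, accuracy = evaluate(example)
```

In practice the per-site records would be read from the imputed and high-coverage variant call files for the same individual, one pair per downsampling depth.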