Selective sweeps

  • Article type: Journal Article
    BACKGROUND: The breeding of layers emphasizes the continual selection of egg-related traits, such as egg production, egg quality and eggshell, which enhance their productivity and meet market demand. As the breeding process continued, the genomic homozygosity of layers gradually increased, resulting in the emergence of runs of homozygosity (ROH). Therefore, ROH analysis can be used in conjunction with other methods to detect selection signatures and identify candidate genes associated with various important traits in layer breeding.
    RESULTS: In this study, we generated whole-genome sequencing data from 686 hens in a Rhode Island Red population that had undergone fifteen consecutive generations of intensive artificial selection. We performed a genome-wide ROH analysis and utilized multiple methods to detect signatures of selection. A total of 141,720 ROH segments were discovered in the whole population, and most of them (97.35%) were less than 3 Mb in length. Twenty-three ROH islands were identified, and they overlapped with some regions bearing selection signatures, which were detected by the de-correlated composite of multiple signals (DCMS) method. Sixty genes were discovered, and functional annotation analysis revealed their possible roles in growth, development, immunity and signaling in layers. Additionally, two-tailed analyses including DCMS and ROH were conducted for 44 layer phenotypes to find the genomic differences between the subgroups of individuals in the top and bottom 10% of each phenotype. Combining these with the GWAS results, we observed that regions significantly associated with traits also exhibited selection signatures between the high and low subgroups. We identified a region significantly associated with egg weight near 25 Mb on GGA1, which exhibited selection signatures and had higher genomic homozygosity in the low egg weight subpopulation. This suggests that the region may play a role in the decline in egg weight.
    CONCLUSIONS: In summary, through the combined analysis of ROH, selection signatures, and GWAS, we identified several genomic regions associated with the production traits of layers, providing a reference for the study of the layer genome.
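The abstract above does not say how ROH segments were called, so the following is only a minimal sketch of the general idea: scan one individual's genotypes along a chromosome and report stretches of consecutive homozygous SNPs. The function names `find_roh` and `f_roh`, the genotype coding (0/1/2 alternate-allele counts), and the default thresholds are all illustrative assumptions, not the authors' pipeline. Real callers also tolerate a few heterozygous or missing calls per window, which this sketch does not.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[int, int, int]  # (start_bp, end_bp, n_snps)


def find_roh(positions: Sequence[int], genotypes: Sequence[int],
             min_snps: int = 50, min_length_bp: int = 300_000) -> List[Segment]:
    """Return runs of consecutive homozygous SNPs for one individual.

    genotypes holds alternate-allele counts: 0 or 2 is homozygous,
    1 is heterozygous. Thresholds are hypothetical defaults.
    """
    segments: List[Segment] = []
    start = None  # index of the first SNP in the current run, if any

    def close_run(first: int, last: int) -> None:
        # Keep the run only if it passes both the SNP-count and length filters.
        n_snps = last - first + 1
        length = positions[last] - positions[first]
        if n_snps >= min_snps and length >= min_length_bp:
            segments.append((positions[first], positions[last], n_snps))

    for i, g in enumerate(genotypes):
        if g != 1:
            if start is None:
                start = i
        elif start is not None:
            # A heterozygous call ends the current run.
            close_run(start, i - 1)
            start = None
    if start is not None:
        close_run(start, len(genotypes) - 1)
    return segments


def f_roh(segments: Sequence[Segment], genome_length_bp: int) -> float:
    """Fraction of the genome covered by ROH (the usual F_ROH coefficient)."""
    return sum(end - start for start, end, _ in segments) / genome_length_bp
```

On a toy input, `find_roh([100, 200, 300, 400, 500, 600, 700, 800], [0, 0, 2, 1, 0, 0, 0, 2], min_snps=3, min_length_bp=200)` yields two segments, split at the single heterozygous SNP at position 400.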

  • Article type: Journal Article
    Inferring the demographic history of populations provides fundamental insights into species dynamics and is essential for developing a null model to accurately study selective processes. However, background selection and selective sweeps can produce genomic signatures at linked sites that mimic or mask signals associated with historical population size change. While the theoretical biases introduced by the linked effects of selection have been well established, it is unclear whether ancestral recombination graph (ARG)-based approaches to demographic inference in typical empirical analyses are susceptible to misinference due to these effects. To address this, we developed highly realistic forward simulations of human and Drosophila melanogaster populations, including empirically estimated variability of gene density, mutation rates, recombination rates, and purifying and positive selection, across different historical demographic scenarios, to broadly assess the impact of selection on demographic inference using a genealogy-based approach. Our results indicate that the linked effects of selection minimally impact demographic inference for human populations, although they could cause misinference in populations with similar genome architecture and population parameters that experience more frequent recurrent sweeps. We found that accurate demographic inference of D. melanogaster populations by ARG-based methods is compromised by the presence of pervasive background selection alone, leading to spurious inferences of recent population expansion, which may be further worsened by recurrent sweeps, depending on the proportion and strength of beneficial mutations. Caution and additional testing with species-specific simulations are needed when inferring population history with non-human populations using ARG-based approaches to avoid misinference due to the linked effects of selection.

  • Article type: Journal Article
    The taxonomic classification of Picea meyeri and P. mongolica has long been controversial. To investigate the genetic relatedness, evolutionary history, and population history dynamics of these species, genotyping-by-sequencing (GBS) technology was utilized to acquire whole-genome single nucleotide polymorphism (SNP) markers, which were subsequently used to assess population structure, population dynamics, and adaptive differentiation. Phylogenetic and population structural analyses at the genomic level indicated that although the ancestor of P. mongolica was a hybrid of P. meyeri and P. koraiensis, P. mongolica is an independent Picea species. Additionally, P. mongolica is more closely related to P. meyeri than to P. koraiensis, which is consistent with its geographic distribution. There were up to eight instances of interspecific and intraspecific gene flow between P. meyeri and P. mongolica. The P. meyeri and P. mongolica effective population sizes generally decreased, and Maxent modeling revealed that from the Last Glacial Maximum (LGM) to the present, their habitat areas decreased initially and then increased. However, under future climate scenarios, the habitat areas of both species were projected to decrease, especially under high-emission scenarios, which would place P. mongolica at risk of extinction and in urgent need of protection. Local adaptation has promoted differentiation between P. meyeri and P. mongolica. Genotype-environment association analysis revealed 96,543 SNPs associated with environmental factors, mainly related to plant adaptations to moisture and temperature. Selective sweep analysis revealed that the genes under selection in P. meyeri, P. mongolica and P. koraiensis are primarily associated with flowering, fruit development, and stress resistance in vascular plants. This research enhances our understanding of Picea species classification and provides a basis for future genetic improvement and species conservation efforts.

  • Article type: Preprint
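    Population branch statistics are widely used in genome-wide scans to identify loci associated with local adaptation. This study finds that, across a wide range of demographic parameters and evolutionary models, branch statistics are more accurate than FST at identifying local selective sweeps. It also shows that certain branch statistics improve the ability to distinguish local adaptation from other models of natural selection.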
    Population branch statistics, which estimate the branch lengths of focal populations with respect to two outgroups, have been used as an alternative to FST-based genome-wide scans for identifying loci associated with local selective sweeps. In addition to the original population branch statistic (PBS), two branch rescalings were subsequently proposed: the normalized population branch statistic (PBSn1), which adjusts focal branch length with respect to outgroup branch lengths at the same locus, and population branch excess (PBE), which also incorporates median branch lengths at other loci. PBSn1 and PBE have been proposed to be less sensitive to allele frequency divergence generated by background selection or geographically ubiquitous positive selection rather than local selective sweeps. However, the accuracy and statistical power of branch statistics have not been systematically assessed. To do so, we simulate genomes in representative large and small populations with varying proportions of sites evolving under genetic drift or background selection (approximated using variable Ne), local selective sweeps, and geographically parallel selective sweeps. We then assess the probability that local selective sweep loci are correctly identified as outliers by FST and by each of the branch statistics. We find that branch statistics consistently outperform FST at identifying local sweeps. When background selection and/or parallel sweeps are introduced, PBSn1 and especially PBE correctly identify local sweeps among their top outliers at a higher frequency than PBS. These results validate the greater specificity of rescaled branch statistics such as PBE to detect population-specific positive selection, supporting their use in genomic studies focused on local adaptation.
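The abstract names PBS, PBSn1 and PBE without giving formulas. The sketch below uses the definitions as they are commonly stated in the literature, which is an assumption on my part rather than something this abstract confirms: branch length T = -log(1 - FST), PBS for focal population A as (T_AB + T_AC - T_BC) / 2, PBSn1 as the focal PBS divided by one plus the sum of the three populations' PBS values, and PBE as observed PBS minus an expected PBS scaled from genome-wide medians. All function names are hypothetical.

```python
import math
from statistics import median
from typing import List, Sequence, Tuple

Locus = Tuple[float, float, float]  # (FST_AB, FST_AC, FST_BC), A is the focal population


def branch_length(fst: float) -> float:
    """T = -log(1 - FST); FST is clamped to keep the logarithm finite."""
    fst = min(max(fst, 0.0), 0.999999)
    return -math.log(1.0 - fst)


def pbs(fst_ab: float, fst_ac: float, fst_bc: float) -> float:
    """Branch length leading to population A in the three-population tree."""
    return (branch_length(fst_ab) + branch_length(fst_ac)
            - branch_length(fst_bc)) / 2.0


def pbsn1(fst_ab: float, fst_ac: float, fst_bc: float) -> float:
    """PBS for A normalized by the total tree length at the same locus."""
    pbs_a = pbs(fst_ab, fst_ac, fst_bc)
    pbs_b = pbs(fst_ab, fst_bc, fst_ac)  # B is focal: pairs AB, BC versus AC
    pbs_c = pbs(fst_ac, fst_bc, fst_ab)  # C is focal: pairs AC, BC versus AB
    return pbs_a / (1.0 + pbs_a + pbs_b + pbs_c)


def pbe(loci: Sequence[Locus]) -> List[float]:
    """Observed PBS minus the PBS expected from the outgroup branch T_BC,
    with the expectation scaled by genome-wide medians."""
    pbs_vals = [pbs(*locus) for locus in loci]
    t_bc_vals = [branch_length(locus[2]) for locus in loci]
    t_bc_median = median(t_bc_vals)
    if t_bc_median == 0:
        raise ValueError("median T_BC is zero; PBE is undefined")
    scale = median(pbs_vals) / t_bc_median
    return [p - t * scale for p, t in zip(pbs_vals, t_bc_vals)]
```

A locus where only the focal population has diverged, such as FST values (0.5, 0.5, 0.1) against a background of (0.1, 0.1, 0.1), gets a large positive PBE, while the background loci score near zero.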

  • Article type: Journal Article
    Understanding the genetic basis of local adaptation is crucial in the context of global climate change. Mangroves, as salt-tolerant trees and shrubs in the intertidal zone of tropical and subtropical coastlines, are particularly vulnerable to climate change. Kandelia obovata, the most cold-tolerant mangrove species, has undergone ecological speciation from its cold-intolerant counterpart, Kandelia candel, with geographic separation by the South China Sea. In this study, we conducted whole-genome re-sequencing of K. obovata populations along China's southeast coast to elucidate the genetic basis of mangrove local adaptation to climate. Our analysis revealed a strong population structure among the three K. obovata populations, with complex demographic histories involving population expansion, bottleneck, and gene flow. Genome-wide scans unveiled pronounced patterns of selective sweeps in highly differentiated regions among pairwise populations, with stronger signatures observed in the northern populations compared to the southern population. Additionally, significant genotype-environment associations for temperature-related variables were identified, while no associations were detected for precipitation. A set of 39 high-confidence candidate genes underlying local adaptation of K. obovata were identified, which are distinct from genes under selection detected by comparison between K. obovata and its cold-intolerant relative K. candel. These results significantly contribute to our understanding of the genetic underpinnings of local adaptation in K. obovata and provide valuable insights into the evolutionary processes shaping the genetic diversity of mangrove populations in response to climate change.

  • Article type: Journal Article
    The roughskin sculpin (Trachidermus fasciatus) is an endangered fish species in China. In recent years, artificial breeding technology has made significant progress, and the population of roughskin sculpin has recovered in the natural environment through enhancement programs and the release of juveniles. However, the effects of released roughskin sculpin on the genetic structure and diversity of wild populations remain unclear. Studies on genetic diversity analysis based on different types and numbers of molecular markers have yielded inconsistent results. In this study, we obtained 2,610,157 high-quality SNPs and 494,698 InDels through whole-genome resequencing of two farmed populations and one wild population. Both farmed populations showed consistent levels of genomic polymorphism and a slight increase in linkage compared with wild populations. The population structure of the two farmed populations was distinct from that of the wild population, but the degree of genetic differentiation was low (overall average Fst = 0.015). Selective sweep analysis showed that 523,529 genes were selected in the two farmed populations, and KEGG enrichment analysis showed that the selected genes were related to amino acid metabolism, which might be caused by artificial feeding. The findings of this study provide valuable additions to the existing genomic resources to help conserve roughskin sculpin populations.

  • Article type: Journal Article
    The detection of selective sweeps from population genomic data often relies on the premise that the beneficial mutations in question have fixed very near the sampling time. As it has been previously shown that the power to detect a selective sweep is strongly dependent on the time since fixation as well as the strength of selection, it is naturally the case that strong, recent sweeps leave the strongest signatures. However, the biological reality is that beneficial mutations enter populations at a rate that partially determines the mean wait time between sweep events and hence their age distribution. An important question thus remains about the power to detect recurrent selective sweeps when they are modeled by a realistic mutation rate and as part of a realistic distribution of fitness effects, as opposed to a single, recent, isolated event on a purely neutral background as is more commonly modeled. Here we use forward-in-time simulations to study the performance of commonly used sweep statistics, within the context of more realistic evolutionary baseline models incorporating purifying and background selection, population size change, and mutation and recombination rate heterogeneity. Results demonstrate the important interplay of these processes, necessitating caution when interpreting selection scans; specifically, false-positive rates exceed true-positive rates across much of the evaluated parameter space, and selective sweeps are often undetectable unless selection is exceptionally strong.

  • Article type: Journal Article
    BACKGROUND: During domestication and subsequent improvement, plants were subjected to intensive positive selection for desirable traits. Identification of selection targets is important with respect to the future targeted broadening of diversity in breeding programmes. Rye (Secale cereale L.) is a cereal that is closely related to wheat, and it is an important crop in Central, Eastern and Northern Europe. The aim of the study was (i) to identify diverse groups of rye accessions based on high-density, genome-wide analysis of genetic diversity within a set of 478 rye accessions, covering a full spectrum of diversity within the genus, from wild accessions to inbred lines used in hybrid breeding, and (ii) to identify selective sweeps in the established groups of cultivated rye germplasm and putative candidate genes targeted by selection.
    RESULTS: Population structure and genetic diversity analyses based on high-quality SNP (DArTseq) markers revealed the presence of three complexes in the Secale genus: S. sylvestre, S. strictum and S. cereale/vavilovii, a relatively narrow diversity of S. sylvestre, very high diversity of S. strictum, and signatures of strong positive selection in S. vavilovii. Within cultivated ryes we detected the presence of genetic clusters and the influence of improvement status on the clustering. Rye landraces represent a reservoir of variation for breeding, and a distinct group of landraces from Turkey should be of particular interest as a source of untapped variation. Selective sweep detection in cultivated accessions identified 133 outlier positions within 13 sweep regions and 170 putative candidate genes related, among others, to response to various environmental stimuli (such as pathogens, drought, cold), plant fertility and reproduction (pollen sperm cell differentiation, pollen maturation, pollen tube growth), and plant growth and biomass production.
    CONCLUSIONS: Our study provides valuable information for efficient management of rye germplasm collections, which can help to ensure proper safeguarding of their genetic potential, and provides numerous novel candidate genes targeted by selection in cultivated rye for further functional characterisation and allelic diversity studies.

  • Article type: Journal Article
    The American mink (Neovison vison) is a semiaquatic mustelid native to North America that is now widespread in China. However, knowledge of the genetic diversity of mink in China is still limited. In this study, we investigated the genetic diversity and identified significant single nucleotide polymorphisms (SNPs) in mink populations of five different color types in three different mink farms in China. Using double-digest restriction site-associated DNA sequencing, we identified a total of 1.3 million SNPs. After filtering the SNPs, phylogenetic tree, Fst, principal component, and population structure analyses were performed. The results demonstrated that red mink and black mink grouped together, with separate clustering of all other color types. The population divergence index (Fst) analysis confirmed that the different mink populations were distinct (K = 4). Two populations with different coat colors were subjected to selection signature analysis, and 2300 genes were found to have a clear selection signature. The genes with a selection signature were subjected to Gene Ontology (GO) categorization and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis; the results revealed that these genes were enriched in the melanogenesis pathway. This study's findings set the stage for improved breeding and conservation of genetic resources in practical mink farming.

  • Article type: Journal Article
    Despite decades of research, identifying selective sweeps, the genomic footprints of positive selection, remains a core problem in population genetics. Of the myriad methods that have been developed to tackle this task, few are designed to leverage the potential of genomic time-series data. This is because in most population genetic studies of natural populations, only a single period of time can be sampled. Recent advancements in sequencing technology, including improvements in extracting and sequencing ancient DNA, have made repeated samplings of a population possible, allowing for more direct analysis of recent evolutionary dynamics. Serial sampling of organisms with shorter generation times has also become more feasible due to improvements in the cost and throughput of sequencing. With these advances in mind, here we present Timesweeper, a fast and accurate convolutional neural network-based tool for identifying selective sweeps in data consisting of multiple genomic samplings of a population over time. Timesweeper analyzes population genomic time-series data by first simulating training data under a demographic model appropriate for the data of interest, training a one-dimensional convolutional neural network on said simulations, and inferring which polymorphisms in this serialized data set were the direct target of a completed or ongoing selective sweep. We show that Timesweeper is accurate under multiple simulated demographic and sampling scenarios, identifies selected variants with high resolution, and estimates selection coefficients more accurately than existing methods. 
    In sum, we show that more accurate inferences about natural selection are possible when genomic time-series data are available; such data will continue to proliferate in coming years due to both the sequencing of ancient samples and repeated samplings of extant populations with faster generation times, as well as experimentally evolved populations where time-series data are often generated. Methodological advances such as Timesweeper thus have the potential to help resolve the controversy over the role of positive selection in the genome. We provide Timesweeper as a Python package for use by the community.