population structure

  • Article type: Journal Article
    Despite the popularity of discriminant analysis of principal components (DAPC) for studying population structure, there has been little discussion of best practice for this method. In this work, I provide guidelines for standardizing the application of DAPC to genotype data sets. An often overlooked fact is that DAPC generates a model describing genetic differences among a set of populations defined by a researcher. Appropriate parameterization of this model is critical for obtaining biologically meaningful results. I show that the number of leading PC axes used as predictors of among-population differences, p_axes, should not exceed the k-1 biologically informative PC axes that are expected for k effective populations in a genotype data set. This k-1 criterion for p_axes specification is more appropriate than the widely used proportional variance criterion, which often results in a choice of p_axes ≫ k-1. DAPC parameterized with no more than the leading k-1 PC axes: (i) is more parsimonious; (ii) captures maximal among-population variation on biologically relevant predictors; (iii) is less sensitive to unintended interpretations of population structure; and (iv) is more generally applicable to independent sample sets. Assessing model fit should be routine practice and aids interpretation of population structure. It is imperative that researchers articulate their study goals, that is, testing a priori expectations vs. studying de novo inferred populations, because this has implications for how their DAPC results should be interpreted. The discussion and practical recommendations in this work provide the molecular ecology community with a roadmap for using DAPC in population genetic investigations.
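
The contrast between the two criteria in the abstract above can be illustrated with a minimal Python sketch. It is not the author's code: the function names, the 0.8 variance threshold, and the eigenvalue spectrum are all hypothetical, chosen only to show how a proportional variance rule can retain far more than k-1 leading PC axes.

```python
from itertools import accumulate


def n_axes_proportional_variance(eigenvalues, threshold=0.8):
    """Smallest number of leading PC axes whose cumulative share of
    total variance reaches `threshold` (eigenvalues sorted descending)."""
    total = sum(eigenvalues)
    for n, running in enumerate(accumulate(eigenvalues), start=1):
        if running / total >= threshold:
            return n
    return len(eigenvalues)


def n_axes_k_minus_1(k, n_available):
    """k-1 criterion: retain at most k-1 leading PC axes for k populations."""
    if k < 2:
        raise ValueError("discriminating populations requires k >= 2")
    return min(k - 1, n_available)


# Hypothetical spectrum (an assumption for illustration): two large axes,
# as expected for k = 3 populations, followed by 47 small axes.
eigenvalues = [12.0, 8.0] + [1.0] * 47

by_variance = n_axes_proportional_variance(eigenvalues, threshold=0.8)
by_k_minus_1 = n_axes_k_minus_1(k=3, n_available=len(eigenvalues))
print(by_variance, by_k_minus_1)
```

With this made-up spectrum, the variance rule keeps many small axes to reach its threshold, whereas the k-1 rule stops at the two axes that carry the among-population signal.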

  • Article type: Journal Article
    A consensus genetic map for Pinus taeda (loblolly pine) and Pinus elliottii (slash pine) was constructed by merging three previously published P. taeda maps with a map from a pseudo-backcross between P. elliottii and P. taeda. The consensus map positioned 3856 markers via genotyping of 1251 individuals from four pedigrees. It is the densest linkage map for a conifer to date. Average marker spacing was 0.6 cM and total map length was 2305 cM. Functional predictions of mapped genes were improved by aligning expressed sequence tags used for marker discovery to full-length P. taeda transcripts. Alignments to the P. taeda genome mapped 3305 scaffold sequences onto 12 linkage groups (LGs). The consensus genetic map was used to compare genome-wide linkage disequilibrium (LD) in a population of distantly related P. taeda individuals (ADEPT2) used for association genetic studies and a multiple-family pedigree used for genomic selection (CCLONES). The prevalence and extent of LD were greater in CCLONES than in ADEPT2; however, extended LD within LGs or between LGs was rare in both populations. The average squared correlations, r^2, between SNP alleles less than 1 cM apart were less than 0.05 in both populations, and r^2 did not decay substantially with genetic distance. The consensus map and analysis of linkage disequilibrium establish a foundation for comparative association mapping and genomic selection in P. taeda and P. elliottii.
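
As a rough illustration of the two quantities reported above, the following Python sketch checks the average marker spacing from the stated totals and computes a squared correlation between two SNPs. It is a sketch under assumptions, not the study's pipeline: genotypes are assumed to be coded as allele dosages (0, 1, 2), and the example genotype vectors are invented.

```python
def squared_correlation(x, y):
    """Squared Pearson correlation between two equal-length genotype
    vectors, assumed here to be coded as allele dosages (0, 1, 2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r^2 is undefined for a monomorphic SNP")
    return sxy ** 2 / (sxx * syy)


# Average spacing from the totals stated in the abstract: 2305 cM, 3856 markers.
average_spacing_cm = 2305 / 3856

# Invented dosage vectors for four individuals (illustration only).
snp_a = [0, 2, 0, 2]
snp_b = [0, 0, 2, 2]  # varies independently of snp_a in this toy sample

r2_unlinked = squared_correlation(snp_a, snp_b)
r2_identical = squared_correlation(snp_a, snp_a)
print(round(average_spacing_cm, 1), r2_unlinked, r2_identical)
```

The spacing rounds to the 0.6 cM given in the abstract; the toy SNP pair gives the two extremes of r^2, against which the reported averages below 0.05 can be read as very weak LD.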