population genetic methods

  • Article type: Journal Article
    Salmonella is one of the main causes of human foodborne illness. It is endemic worldwide, with different animals and animal-based food products as reservoirs and vehicles of infection. Identifying animal reservoirs and potential transmission pathways of Salmonella is essential for prevention and control. There are many approaches for source attribution, each using different statistical models and data streams. Some aim to identify the animal reservoir, while others aim to determine the point at which exposure occurred. With the advance of whole-genome sequencing (WGS) technologies, new source attribution models will greatly benefit from the discriminating power gained with WGS. This review discusses some key source attribution methods and their mathematical and statistical tools. We also highlight recent studies utilizing WGS for source attribution and discuss open questions and challenges in developing new WGS methods. We aim to provide a better understanding of the current state of these methodologies with application to Salmonella and other foodborne pathogens that are common sources of illness in the poultry and human sectors.

  • Article type: Journal Article
    Despite the popularity of discriminant analysis of principal components (DAPC) for studying population structure, there has been little discussion of best practice for this method. In this work, I provide guidelines for standardizing the application of DAPC to genotype data sets. An often overlooked fact is that DAPC generates a model describing genetic differences among a set of populations defined by a researcher. Appropriate parameterization of this model is critical for obtaining biologically meaningful results. I show that the number of leading PC axes used as predictors of among-population differences, p_axes, should not exceed the k-1 biologically informative PC axes that are expected for k effective populations in a genotype data set. This k-1 criterion for p_axes specification is more appropriate than the widely used proportional variance criterion, which often results in a choice of p_axes ≫ k-1. DAPC parameterized with no more than the leading k-1 PC axes: (i) is more parsimonious; (ii) captures maximal among-population variation on biologically relevant predictors; (iii) is less sensitive to unintended interpretations of population structure; and (iv) is more generally applicable to independent sample sets. Assessing model fit should be routine practice and aids interpretation of population structure. It is imperative that researchers articulate their study goals, that is, testing a priori expectations vs. studying de novo inferred populations, because this has implications for how their DAPC results should be interpreted. The discussion and practical recommendations in this work provide the molecular ecology community with a roadmap for using DAPC in population genetic investigations.
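
The k-1 criterion described in this abstract can be illustrated with a minimal NumPy sketch. It is not the author's code or any package's implementation: the function name `dapc_k_minus_1`, the 0/1/2 allele-count coding of genotypes, and the use of the number of distinct population labels as k are all assumptions made for illustration. The sketch runs a PCA on the genotype matrix, keeps only the leading k-1 PC axes, and then fits a linear discriminant analysis on those PC scores.

```python
import numpy as np


def dapc_k_minus_1(genotypes, populations):
    """Sketch of DAPC with p_axes = k - 1 (hypothetical helper, not a library API).

    genotypes   : (n_individuals, n_loci) array, assumed coded as 0/1/2 allele counts
    populations : length-n array of researcher-defined population labels
    """
    X = np.asarray(genotypes, dtype=float)
    labels = np.asarray(populations)
    groups = np.unique(labels)
    k = len(groups)        # assumption: k is taken from the supplied labels
    p_axes = k - 1         # the k-1 criterion: retain only the leading k-1 PC axes

    # Step 1: PCA via SVD of the column-centred genotype matrix.
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    pc_scores = U[:, :p_axes] * S[:p_axes]          # shape (n, p_axes)

    # Step 2: linear discriminant analysis on the retained PC scores.
    grand_mean = pc_scores.mean(axis=0)
    Sw = np.zeros((p_axes, p_axes))                 # within-population scatter
    Sb = np.zeros((p_axes, p_axes))                 # among-population scatter
    for g in groups:
        Z = pc_scores[labels == g]
        mu = Z.mean(axis=0)
        Sw += (Z - mu).T @ (Z - mu)
        d = (mu - grand_mean)[:, None]
        Sb += len(Z) * (d @ d.T)

    # Solve Sb v = lambda Sw v by whitening with the Cholesky factor of Sw,
    # which turns it into a symmetric eigenproblem.
    L = np.linalg.cholesky(Sw)                      # Sw = L L^T
    A = np.linalg.solve(L, Sb)                      # L^-1 Sb
    M = np.linalg.solve(L, A.T)                     # L^-1 Sb L^-T
    M = (M + M.T) / 2.0                             # remove rounding asymmetry
    evals, W = np.linalg.eigh(M)
    order = np.argsort(evals)[::-1]                 # largest discriminant first
    V = np.linalg.solve(L.T, W[:, order])           # discriminant axes in PC space

    ld_scores = (pc_scores - grand_mean) @ V
    return p_axes, pc_scores, ld_scores
```

With three populations, for example, the function retains two PC axes and returns two discriminant functions, whatever share of total variance those two axes explain. A proportional variance rule would instead keep adding axes until a variance threshold is reached, which is the behaviour the abstract argues against.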