1000 Genomes

  • Article type: Journal Article
    SNP-based imputation approaches for human leukocyte antigen (HLA) typing take advantage of the haplotype structure within the major histocompatibility complex (MHC) region. These methods predict HLA classical alleles using dense SNP genotypes, commonly found on array-based platforms used in genome-wide association studies (GWAS). The analysis of HLA classical alleles can be conducted on current SNP datasets at no additional cost. Here, we describe the workflow of HIBAG, an imputation method with attribute bagging, to infer a sample's HLA classical alleles using SNP data. Two examples are offered to demonstrate the functionality using public HLA and SNP data from the latest release of the 1000 Genomes Project: genotype imputation using pre-built classifiers in a GWAS, and model training to create a new prediction model. The GPU implementation facilitates model building, making it hundreds of times faster than the single-threaded implementation.

  • Article type: Journal Article
    Currently, the genetic variants strongly associated with risk for Multiple Sclerosis (MS) are located in the Major Histocompatibility Complex. This includes DRB1*15:01 and DRB1*15:03 alleles at the HLA-DRB1 locus, the latter restricted to African populations; the DQB1*06:02 allele at the HLA-DQB1 locus, which is in high linkage disequilibrium (LD) with DRB1*15:01; and the protective allele A*02:01 at the HLA-A locus. HLA allele identification is facilitated by co-inherited ('tag') single nucleotide polymorphisms (SNPs); however, SNP validation is not typically done outside of the discovery population. We examined 19 SNPs reported to be in high LD with these alleles in 2,502 healthy subjects included in the 1000 Genomes panel having typed HLA data. Examination of 3 indices (LD R² values, sensitivity and specificity, minor allele frequency) revealed few SNPs with high tagging performance. All SNPs examined that tag DRB1*15:01 were in perfect LD in the British population; three showed high tagging performance in 4 of the 5 European, and 2 of the 4 American populations. For DQB1*06:02, with no previously validated tag SNPs, we show that rs3135388 has high tagging performance in one South Asian, one American, and one European population. We identify for the first time that rs2844821 has high tagging performance for A*02:01 in 5 of 7 African populations including African Americans, and 4 of the 5 European populations. These results provide a basis for selecting SNPs with high tagging performance to assess HLA alleles across diverse populations, for MS risk as well as for other diseases and conditions.
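The three tagging indices named in this abstract can be computed from paired presence/absence calls. The following is a minimal sketch, not the authors' pipeline: it assumes each chromosome is coded 1 or 0 for carrying the HLA allele and for carrying the candidate tag SNP allele, and it takes R² as the squared phi coefficient of the resulting 2×2 table. The function name and the toy data are hypothetical.

```python
def tagging_metrics(has_hla_allele, has_tag_allele):
    """Summarise how well a tag SNP allele predicts an HLA allele.

    Both arguments are equal-length sequences of 0/1 calls (an assumed coding).
    """
    pairs = list(zip(has_hla_allele, has_tag_allele))
    tp = sum(1 for h, t in pairs if h and t)          # HLA allele and tag both present
    fn = sum(1 for h, t in pairs if h and not t)      # HLA allele present, tag absent
    fp = sum(1 for h, t in pairs if not h and t)      # tag present, HLA allele absent
    tn = sum(1 for h, t in pairs if not h and not t)  # both absent
    n = tp + fn + fp + tn
    nan = float("nan")

    sensitivity = tp / (tp + fn) if (tp + fn) else nan
    specificity = tn / (tn + fp) if (tn + fp) else nan

    # Squared phi coefficient of the 2x2 table, used here as the LD R^2 value.
    denom = (tp + fn) * (fp + tn) * (tp + fp) * (fn + tn)
    r2 = (tp * tn - fp * fn) ** 2 / denom if denom else nan

    # Minor allele frequency of the tag SNP in this sample.
    tag_freq = (tp + fp) / n if n else nan
    maf = min(tag_freq, 1 - tag_freq) if n else nan

    return {"sensitivity": sensitivity, "specificity": specificity,
            "r2": r2, "maf": maf}


# Toy data for eight chromosomes (illustrative values only).
hla = [1, 1, 0, 0, 0, 0, 1, 0]
tag = [1, 1, 0, 0, 0, 1, 0, 0]
metrics = tagging_metrics(hla, tag)
```

A tag SNP in perfect LD with the HLA allele would give sensitivity, specificity, and R² all equal to 1 in this coding.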

  • Article type: Journal Article
    The VISAGE Enhanced Tool for Appearance and Ancestry (ET) has been designed to combine markers for the prediction of bio-geographical ancestry plus a range of externally visible characteristics into a single massively parallel sequencing (MPS) assay. We describe the development of the ancestry panel markers used in ET, and the enhanced analyses they provide compared to previous MPS-based forensic ancestry assays. As well as established autosomal single nucleotide polymorphisms (SNPs) that differentiate sub-Saharan African, European, East Asian, South Asian, Native American, and Oceanian populations, ET includes autosomal SNPs able to efficiently differentiate populations from Middle East regions. The ability of the ET autosomal ancestry SNPs to distinguish Middle East populations from other continentally defined population groups is such that characteristic patterns for this region can be discerned in genetic cluster analysis using STRUCTURE. Joint cluster membership estimates showing individual co-ancestry that signals North African or East African origins were detected, or cluster patterns were seen that indicate origins from central and Eastern regions of the Middle East. In addition to an augmented panel of autosomal SNPs, ET includes panels of 85 Y-SNPs, 16 X-SNPs and 21 autosomal Microhaplotypes. The Y- and X-SNPs provide a distinct method for obtaining extra detail about co-ancestry patterns identified in males with admixed backgrounds. This study used the 1000 Genomes admixed African and admixed American sample sets to fully explore these enhancements to the analysis of individual co-ancestry. Samples from urban and rural Brazil with contrasting distributions of African, European, and Native American co-ancestry were also studied to gauge the efficiency of combining Y- and X-SNP data for this purpose. 
    The small panel of Microhaplotypes incorporated in ET was selected because they showed the highest levels of haplotype diversity amongst the seven population groups we sought to differentiate. Microhaplotype data were not formally combined with single-site SNP genotypes to analyse ancestry. However, the haplotype sequence reads obtained with ET from these loci create an effective system for deconvoluting two-contributor mixed DNA. We ran simple mixture experiments to demonstrate that, when the contributors have different ancestries and the mixture ratios are imbalanced (i.e., not 1:1 mixtures), the ET Microhaplotype panel is an informative system for inferring the ancestry of the contributors.

  • Article type: Journal Article
    In Brazil, high levels of agricultural activity are reflected in the consumption of enormous amounts of pesticides. The production of grain in Brazil has been estimated at 289.8 million tons in the 2022 harvest, an expansion of 14.7% compared with 2021. These advances are likely associated with a progressive increase in the occupational exposure of the population to pesticides. The Paraoxonase 1 gene (PON1) is involved in liver detoxification; the rs662 variant of this gene modifies the activity of the enzyme. The repair of pesticide-induced genetic damage depends on the protein produced by the X-Ray Repair Cross-Complementing Group 1 gene (XRCC1), whose function is impaired by the rs25487 variant. The present study describes the frequencies of the rs662 and rs25487 variants and their haplotypes in a sample of 494 unrelated individuals from the state of Goiás, Brazil, and compares them with frequencies in other populations worldwide to assess variation in the distribution of these SNPs. The A allele of the rs25487 variant had a frequency of 26% in the Goiás population, and the modified rs662 G allele had a frequency of 42.8%. Four haplotypes were recorded for the rs25487 (G > A) and rs662 (A > G) markers, with a frequency of 11.9% for the A-G haplotype (both modified alleles), 30.8% for the G-G haplotype, 14.3% for the A-A haplotype, and 42.8% for the G-A haplotype (both wild-type alleles). We demonstrated the distribution of important SNPs associated with pesticide exposure in Central Brazil, an area with a high level of agricultural activity.
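The reported allele frequencies can be cross-checked against the four haplotype frequencies by summing over the haplotypes that carry each allele. The sketch below is only that arithmetic check, under the assumption that each haplotype label lists the rs25487 allele first and the rs662 allele second; the variable and function names are hypothetical.

```python
# Haplotype frequencies as reported in the abstract,
# keyed by (rs25487 allele, rs662 allele).
haplotype_freq = {
    ("A", "G"): 0.119,  # both modified alleles
    ("G", "G"): 0.308,
    ("A", "A"): 0.143,
    ("G", "A"): 0.428,  # both wild-type alleles
}


def marginal(freqs, locus_index, allele):
    """Sum the frequencies of all haplotypes carrying `allele` at one locus."""
    return sum(f for hap, f in freqs.items() if hap[locus_index] == allele)


rs25487_a = marginal(haplotype_freq, 0, "A")  # 0.119 + 0.143
rs662_g = marginal(haplotype_freq, 1, "G")    # 0.119 + 0.308
total = sum(haplotype_freq.values())          # slightly below 1 because of rounding

print(f"rs25487 A: {rs25487_a:.3f}, rs662 G: {rs662_g:.3f}, total: {total:.3f}")
```

The sums, about 0.262 and 0.427, agree with the reported 26% and 42.8% to within the rounding of the published figures.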

  • Article type: Journal Article
    BACKGROUND: Population variant analysis is of great importance for gathering insights into the links between human genotype and phenotype. The 1000 Genomes Project established a valuable reference for human genetic variation; however, the integrative use of the corresponding data with other datasets within existing repositories and pipelines is not fully supported. Particularly, there is a pressing need for flexible and fast selection of population partitions based on their variant and metadata-related characteristics.
    RESULTS: Here, we target general germline or somatic mutation data sources for their seamless inclusion within an interoperable-format repository, supporting integration among them and with other genomic data, as well as their integrated use within bioinformatic workflows. In addition, we provide VarSum, a data summarization service working on sub-populations of interest selected using filters on population metadata and/or variant characteristics. The service is developed as an optimized computational framework with an Application Programming Interface (API) that can be called from within any existing computing pipeline or programming script. Provided example use cases of biological interest show the relevance, power and ease of use of the API functionalities.
    CONCLUSIONS: The proposed data integration pipeline and data set extraction and summarization API pave the way for solid computational infrastructures that quickly process cumbersome variation data, and allow biologists and bioinformaticians to easily perform scalable analysis on user-defined partitions of large cohorts from increasingly available genetic variation studies. With the current tendency to large (cross)nation-wide sequencing and variation initiatives, we expect an ever growing need for the kind of computational support hereby proposed.

  • Article type: Journal Article
    To compile a new South Asian-informative panel of forensic ancestry SNPs, we changed the strategy for selecting the most powerful markers for this purpose by targeting polymorphisms with near absolute specificity, that is, where the South Asian-informative allele identified is absent from all other populations or present at frequencies below 0.001 (one in a thousand). More than 120 candidate SNPs were identified from 1000 Genomes datasets satisfying an allele frequency screen of ≥ 0.1 (10% or more) in South Asians, and ≤ 0.001 (0.1% or less) in African, East Asian, and European populations. From the candidate pool of markers, a final panel of 36 SNPs, widely distributed across most autosomes, was selected that had allele frequencies in the five 1000 Genomes South Asian populations ranging from 0.4 to 0.15. Slightly lower average allele frequencies, but consistent patterns of informativeness, were observed in gnomAD South Asian datasets used to validate the 1000 Genomes variant annotations. We named the panel of 36 South Asian-specific SNPs Eurasiaplex-2, and the informativeness of the panel was evaluated by compiling worldwide population data from 4097 samples in four genome variation databases that largely complement the global sampling of 1000 Genomes. Consistent patterns of allele frequency distribution, which were specific to South Asia, were observed in all populations in, or close to, the Indian sub-continent. Pakistani populations from the HGDP-CEPH panel had markedly lower allele frequencies, highlighting the need to develop a statistical system to evaluate the ancestry inference value of counting the number of population-specific alleles present in an individual.
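The allele frequency screen described in this abstract amounts to a two-sided threshold filter. The sketch below illustrates it under stated assumptions: frequencies are held in a plain dictionary keyed by super-population code (SAS, AFR, EAS, EUR), and the SNP identifiers and frequency values are made up for illustration. Only the two thresholds (≥ 0.1 and ≤ 0.001) come from the abstract.

```python
def south_asian_specific(freqs, target="SAS", others=("AFR", "EAS", "EUR"),
                         min_target=0.1, max_other=0.001):
    """Return SNP IDs whose allele is common in `target` and near-absent elsewhere."""
    return [
        snp
        for snp, by_pop in freqs.items()
        if by_pop[target] >= min_target
        and all(by_pop[pop] <= max_other for pop in others)
    ]


# Hypothetical allele frequencies; these are not values from the study.
toy_freqs = {
    "snp_1": {"SAS": 0.22, "AFR": 0.0, "EAS": 0.0005, "EUR": 0.001},
    "snp_2": {"SAS": 0.30, "AFR": 0.02, "EAS": 0.0, "EUR": 0.0},   # too common in AFR
    "snp_3": {"SAS": 0.05, "AFR": 0.0, "EAS": 0.0, "EUR": 0.0},    # too rare in SAS
}

candidates = south_asian_specific(toy_freqs)
print(candidates)
```

Only `snp_1` passes both conditions in this toy table; the other two fail one threshold each.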

  • Article type: Journal Article
    CONCLUSIONS: We developed PyLAE, a new tool for determining local ancestry along a genome using whole-genome sequencing data or high-density genotyping experiments. PyLAE can process an arbitrarily large number of ancestral populations (with or without an informative prior). Since PyLAE does not involve estimating many parameters, it can process thousands of genomes within a day. PyLAE can run on phased or unphased genomic data. We have shown how PyLAE can be applied to the identification of differentially enriched pathways between populations. The local ancestry approach results in higher enrichment scores compared to whole-genome approaches. We benchmarked PyLAE using the 1000 Genomes dataset, comparing the aggregated predictions with the global admixture results and the current gold standard program RFMix. Computational efficiency, minimal requirements for data pre-processing, straightforward presentation of results, and ease of installation make PyLAE a valuable tool to study admixed populations.
    METHODS: The source code and installation manual are available at https://github.com/smetam/pylae.

  • Article type: Journal Article
    Microhaplotypes (MHs), defined by two or more closely linked single nucleotide polymorphisms within a short segment of DNA, are a type of molecular marker. As emerging forensic genetic markers, MHs have no stutter artefacts, are highly polymorphic, and permit the design of smaller amplicons. To identify these markers from a genome-wide perspective and explore their potential applications further, we constructed the most comprehensive MH dataset to date, based on the whole-genome sequencing data of 105 Han individuals in Southern China from the 1000 Genomes Project. The results showed that there were 9,490,075 MH loci within a span of 350 bp in the human genome, and that the distribution density of MHs is indicative of the level of chromosomal variation. Polymorphism analysis of MHs across various base spans showed that their polymorphism could reach or exceed that of commonly used short tandem repeat loci. In addition, based on their flexible assembly, a scheme for building a public microhaplotype database was proposed.

  • Article type: Journal Article
    We detail the development of the ancestry informative single nucleotide polymorphisms (SNPs) panel forming part of the VISAGE Basic Tool (BT), which combines 41 appearance predictive SNPs and 112 ancestry predictive SNPs (three SNPs shared between sets) in one massively parallel sequencing (MPS) multiplex, whereas blood-based age analysis using methylation markers is run in a parallel MPS analysis pipeline. The selection of SNPs for the BT ancestry panel focused on established forensic markers that already have a proven track record of good sequencing performance in MPS, and the overall SNP multiplex scale closely matched that of existing forensic MPS assays. SNPs were chosen to differentiate individuals from the five main continental population groups of Africa, Europe, East Asia, America, and Oceania, extended to include differentiation of individuals from South Asia. From analysis of 1000 Genomes and HGDP-CEPH samples from these six population groups, the BT ancestry panel was shown to have no classification error using the Bayes likelihood calculators of the Snipper online analysis portal. The differentiation power of the component ancestry SNPs of BT was balanced as far as possible to avoid bias in the estimation of co-ancestry proportions in individuals with admixed backgrounds. The balancing process led to very similar cumulative population-specific divergence values for Africa, Europe, America, and Oceania, with East Asia being slightly below average, and South Asia an outlier from the other groups. Comparisons were made of the African, European, and Native American estimated co-ancestry proportions in the six admixed 1000 Genomes populations, using the BT ancestry panel SNPs and 572,000 Affymetrix Human Origins array SNPs. Very similar co-ancestry proportions were observed down to a minimum value of 10%, below which, low-level co-ancestry was not always reliably detected by BT SNPs. 
    The Snipper analysis portal provides a comprehensive population dataset for the BT ancestry panel SNPs, comprising a 520-sample standardised reference dataset; 3445 additional samples from 1000 Genomes, HGDP-CEPH, Simons Foundation and Estonian Biocentre genome diversity projects; and 167 samples of six populations from in-house genotyping of individuals from Middle East, North and East African regions complementing those of the sampling regimes of the other diversity projects.

  • Article type: Journal Article
    The aim of this study is to analyze the worldwide distribution of SNP rs4870723 in the COL14A1 gene, to check whether there are significant genetic differences among populations, and to test whether the gene is under selection.
    Genomic DNA was extracted from 69 unrelated individuals from Sardinia and genotyped for SNP rs4870723. Data were compared with 26 different populations, clustered in 5 super-populations, from the public 1000 Genomes database. Allele frequency and heterozygosity were calculated with Genepop. Hardy-Weinberg equilibrium and pairwise population differentiation through analysis of molecular variance (AMOVA FST) were determined with Arlequin.
    Allele frequencies of COL14A1 rs4870723 were compared in 27 populations clustered in 5 super-populations. All populations were in Hardy-Weinberg equilibrium. In almost all populations, allele C was the most frequent allele, reaching the highest values in East Asia. The 27 populations showed an appreciable structure, with significant differences observed between European, African, and Asian populations.
    Significant differences were observed in the rs4870723 SNP distribution among the populations studied. However, we found no evidence of selective pressure. Rather, the differentiation among the populations is likely the result of founder effect, genetic drift, and cultural factors, all events known to establish and maintain genetic diversity between populations.
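A Hardy-Weinberg check of the kind reported in this abstract can be illustrated with a chi-square goodness-of-fit test on genotype counts. This is a simplified stand-in and not Arlequin's procedure: it assumes a biallelic SNP, at least one genotyped individual, and a one-degree-of-freedom chi-square approximation, which is unreliable for small expected counts. The function name and the genotype counts are hypothetical.

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions for one biallelic SNP.

    Returns (frequency of allele a, chi-square statistic, p-value with 1 df).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele a
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic sample).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2.0))
    return p, chi2, p_value


# Hypothetical genotype counts that match Hardy-Weinberg proportions exactly.
freq_a, stat, p_val = hwe_chi_square(25, 50, 25)
print(freq_a, stat, p_val)
```

A sample with no heterozygotes at all, such as counts of 50, 0, and 50, gives a very large statistic and a p-value close to zero under the same function.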