low-frequency variants

  • Article type: Journal Article
    OBJECTIVE: Can a genome-wide association study (GWAS) meta-analysis, including a large sample of young premenopausal women from a founder population from Northern Finland, identify novel genetic variants for circulating anti-Müllerian hormone (AMH) levels and provide insights into single-nucleotide polymorphism enrichment in different biological pathways and tissues involved in AMH regulation?
    CONCLUSIONS: The meta-analysis identified a total of six loci associated with AMH levels at P < 5 × 10⁻⁸, three of which were novel, in or near CHEK2, BMP4, and EIF4EBP1, and highlighted significant enrichment in renal system vasculature morphogenesis, with the pituitary gland as the top associated tissue in tissue enrichment analysis.
    BACKGROUND: AMH is expressed by preantral and small antral stage ovarian follicles in women, and variation in age-specific circulating AMH levels has been associated with several health conditions. However, the biological mechanisms underlying the association between health conditions and AMH levels are not yet fully understood. Previous GWAS have identified loci associated with AMH levels in pre-menopausal women, in or near MCM8, AMH, TEX41, and CDCA7.
    METHODS: We performed a GWAS meta-analysis for circulating AMH level measurements in 9668 pre-menopausal women.
    METHODS: We performed a GWAS meta-analysis in which we combined 2619 AMH measurements (at age 31 years) from a prospective founder population cohort (Northern Finland Birth Cohort 1966, NFBC1966) with a previous GWAS meta-analysis that included 7049 pre-menopausal women (age range 15-48 years) (N = 9668). NFBC1966 AMH measurements were quantified using an automated assay. We annotated the genetic variants, combined different data layers to prioritize potential candidate genes, described significant pathways and tissues enriched by the GWAS signals, identified plausible regulatory roles using colocalization analysis, and leveraged publicly available summary statistics to assess genetic and phenotypic correlations with multiple traits.
    RESULTS: Three novel genome-wide significant loci were identified. One of these is in complete linkage disequilibrium with c.1100delC in CHEK2, which is found to be 4-fold enriched in the Finnish population compared to other European populations. We propose a plausible regulatory effect of some of the GWAS variants linked to AMH, as they colocalize with GWAS signals associated with gene expression levels of BMP4, TEX41, and EIF4EBP1. Gene set analysis highlighted significant enrichment in renal system vasculature morphogenesis, and tissue enrichment analysis ranked the pituitary gland as the top association.
    METHODS: The GWAS meta-analysis summary statistics are available for download from the GWAS Catalogue with accession number GCST90428625.
    CONCLUSIONS: This study only included women of European ancestry and the lack of sufficiently sized relevant tissue data in gene expression datasets hinders the assessment of potential regulatory effects in reproductive tissues.
    CONCLUSIONS: Our results highlight the increased power of founder populations and larger sample sizes to boost the discovery of novel trait-associated variants underlying variation in AMH levels, which aided the characterization of GWAS signals enrichment in different biological pathways and plausible genetic regulatory effects linked with AMH level variation for the first time.
    BACKGROUND: This work has received funding from the European Union's Horizon 2020 Research and Innovation Programme under the MATER Marie Sklodowska-Curie Grant Agreement No. 813707 and Oulu University Scholarship Foundation and Paulon Säätiö Foundation (N.P.-G.), Academy of Finland, Sigrid Jusélius Foundation, Novo Nordisk, University of Oulu, Roche Diagnostics (T.T.P.). This work was supported by the Estonian Research Council Grant 1911 (R.M.). J.R. was supported by the European Union's Horizon 2020 Research and Innovation Program under Grant Agreements No. 874739 (LongITools), 824989 (EUCAN-Connect), 848158 (EarlyCause), and 733206 (LifeCycle). U.V. was supported by the Estonian Research Council grant PRG (PRG1291). The NFBC1966 received financial support from University of Oulu Grant No. 24000692, Oulu University Hospital Grant No. 24301140, and ERDF European Regional Development Fund Grant No. 539/2010 A31592. T.T.P. has received grants from Roche, Perkin Elmer, and honoraria for scientific presentations from Gedeon Richter, Exeltis, Astellas, Roche, Stragen, Astra Zeneca, Merck, MSD, Ferring, Duodecim, and Ajaton Terveys. For all other authors, there are no competing interests.

  • Article type: Journal Article
    Identifying genuine polymorphic variants is a significant challenge in sequence data analysis, although detecting low-frequency variants in sequence data is essential for estimating demographic parameters and investigating genetic processes, such as selection, within populations. Arbuscular mycorrhizal (AM) fungi are multinucleate organisms, in which individual nuclei collectively operate as a population, and the extent of genetic variation across nuclei has long been an area of scientific interest. In this study, we investigated the patterns of polymorphism discovery and the alternate allele frequency distribution by comparing polymorphism discovery in 2 distinct genomic sequence datasets of the AM fungus model species, Rhizophagus irregularis strain DAOM197198. The 2 datasets used in this study are publicly available and were generated either from pooled spores and hyphae or amplified single nuclei from a single spore. We also estimated the intraorganismal variation within the DAOM197198 strain. Our results showed that the 2 datasets exhibited different frequency patterns for discovered variants. The whole-organism dataset showed a distribution spanning low-, intermediate-, and high-frequency variants, whereas the single-nucleus dataset predominantly featured low-frequency variants with smaller proportions in intermediate and high frequencies. Furthermore, single nucleotide polymorphism density estimates within both the whole organism and individual nuclei confirmed the low intraorganismal variation of the DAOM197198 strain and that most variants are rare. Our study highlights the methodological challenges associated with detecting low-frequency variants in AM fungal whole-genome sequence data and demonstrates that alternate alleles can be reliably identified in single nuclei of AM fungi.
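The comparison of low-, intermediate-, and high-frequency variants described above can be illustrated with a small sketch. The abstract does not state where the class boundaries lie, so the cut-offs `LOW_MAX` and `HIGH_MIN`, the function names, and the toy read counts below are all assumptions made for illustration, not the authors' pipeline.

```python
from collections import Counter

# Hypothetical bin edges; the abstract does not define them.
LOW_MAX = 0.10
HIGH_MIN = 0.90


def frequency_class(alt_reads: int, depth: int) -> str:
    """Classify one variant site by its alternate allele frequency."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    af = alt_reads / depth
    if af < LOW_MAX:
        return "low"
    if af >= HIGH_MIN:
        return "high"
    return "intermediate"


def frequency_spectrum(sites):
    """Count variants per frequency class from (alt_reads, depth) pairs."""
    return Counter(frequency_class(alt, depth) for alt, depth in sites)


# Toy (alt_reads, depth) pairs, invented for illustration only.
example_sites = [(2, 100), (5, 80), (40, 90), (95, 100), (1, 60)]
spectrum = frequency_spectrum(example_sites)
print(spectrum)
```

Running the same tally on a pooled dataset and on a single-nucleus dataset would give the two spectra that the study contrasts; the choice of cut-offs changes the counts, so any real analysis would need to justify them.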

  • Article type: Journal Article
    Unraveling the genetic sources of gene expression variation is essential to better understand the origins of phenotypic diversity in natural populations. Genome-wide association studies have identified thousands of variants involved in gene expression variation; however, the variants detected explain only part of the heritability. In fact, variants such as low-frequency and structural variants (SVs) are poorly captured in association studies. To assess the impact of these variants on gene expression variation, we explored a half-diallel panel composed of 323 hybrids originating from pairwise crosses of 26 natural Saccharomyces cerevisiae isolates. Using short- and long-read sequencing strategies, we established an exhaustive catalog of single nucleotide polymorphisms (SNPs) and SVs for this panel. Combining this dataset with the transcriptomes of all hybrids, we comprehensively mapped SNPs and SVs associated with gene expression variation. While SVs impact gene expression variation, SNPs exhibit a higher effect size with an overrepresentation of low-frequency variants compared to common ones. These results reinforce the importance of dissecting the heritability of complex traits with a comprehensive catalog of genetic variants at the population level.

  • Article type: Journal Article
    Whole-exome sequencing (WES) in families with an unexplained tendency for venous thromboembolism (VTE) may favor detection of low-frequency variants in genes with known contribution to hemostasis or associated with VTE-related phenotypes. WES analysis in six family members, three of whom were affected by documented VTE, filtered for MAF < 0.04 in 192 candidate genes, revealed 22 heterozygous (16 missense and six synonymous) variants in patients. Functional prediction by multi-component bioinformatics tools, supplemented by a database/literature search, including ClinVar annotation and QTL analysis, prioritized 12 missense variants, three of which (CRP Leu61Pro, F2 Asn514Lys and NQO1 Arg139Trp) were present in all patients, and the frequent functional variants FGB Arg478Lys and IL1A Ala114Ser. Combinations of prioritized variants in each patient were used to infer functional protein interactions. Different interaction patterns, supported by high-quality evidence, included eight proteins intertwined in the "acute phase" (CRP, F2, SERPINA1 and IL1A) and/or in the "fibrinogen complex" (CRP, F2, PLAT, THBS1, VWF and FGB) significantly enriched terms. In a wide group of candidate genes, this approach highlighted six low-frequency variants (CRP Leu61Pro, F2 Asn514Lys, SERPINA1 Arg63Cys, THBS1 Asp901Glu, VWF Arg1399His and PLAT Arg164Trp), five of which were top ranked for predicted deleteriousness, which in different combinations may contribute to disease susceptibility in members of this family.
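The first filtering step in this abstract (MAF < 0.04 within a panel of candidate genes) can be sketched as follows. Only the 0.04 cut-off comes from the abstract; the `Variant` record, the function name, the placeholder gene names, and the MAF values are hypothetical and stand in for whatever annotation format the authors actually used.

```python
from dataclasses import dataclass

MAF_CUTOFF = 0.04  # threshold stated in the abstract


@dataclass(frozen=True)
class Variant:
    gene: str
    change: str
    maf: float  # minor allele frequency from a reference population


def filter_candidates(variants, candidate_genes, maf_cutoff=MAF_CUTOFF):
    """Keep variants that lie in a candidate gene and are below the MAF cut-off."""
    return [
        v for v in variants
        if v.gene in candidate_genes and v.maf < maf_cutoff
    ]


# Placeholder genes and frequencies, invented for illustration only.
candidate_genes = {"GENE_A", "GENE_B"}
variants = [
    Variant("GENE_A", "p.Leu10Pro", 0.01),   # rare, in panel: kept
    Variant("GENE_B", "p.Arg20Lys", 0.20),   # in panel but common: dropped
    Variant("GENE_X", "p.Ala30Ser", 0.001),  # rare but outside panel: dropped
]
kept = filter_candidates(variants, candidate_genes)
print([f"{v.gene} {v.change}" for v in kept])
```

The later steps in the abstract (functional prediction, ClinVar annotation, interaction analysis) rely on external tools and are not represented here.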

  • Article type: Journal Article
    Low-frequency mutations associated with drug resistance have been related to virologic failure in subjects with no history of pre-treatment and recent HIV diagnosis. In total, 78 antiretroviral treatment (ART)-naïve subjects with a recent HIV diagnosis were selected and followed by CD4+ T lymphocytes and viral load tests to detect virologic failure. We sequenced the basal samples retrospectively using next-generation sequencing (NGS), looking for low-frequency mutations that had not been detected before using the Sanger sequencing method (SSM) and describing the response to ART. Twenty-two subjects developed virologic failure (VF), and thirteen of them had at least one drug-resistance mutation associated with Reverse Transcriptase Inhibitors (RTI) and Protease Inhibitors (PIs) at frequency levels ≤ 1%, not detected previously in their basal genotyping test. No resistance mutations were observed to Integrase Strand Transfer Inhibitors (INSTIs). We identified a possible cause of VF in ART-naïve subjects with low-frequency mutations detected. To our knowledge, this is the first evaluation of pre-existing drug resistance for HIV-1 minority variants carried out on ART-naïve people living with HIV/AIDS (PLWHA) by analyzing the HIV-1 pol gene using NGS in the country.

  • Article type: Preprint
    Unraveling the genetic sources of gene expression variation is essential to better understand the origins of phenotypic diversity in natural populations. Genome-wide association studies have identified thousands of variants involved in gene expression variation; however, the variants detected explain only part of the heritability. In fact, variants such as low-frequency and structural variants (SVs) are poorly captured in association studies. To assess the impact of these variants on gene expression variation, we explored a half-diallel panel composed of 323 hybrids originating from pairwise crosses of 26 natural Saccharomyces cerevisiae isolates. Using short- and long-read sequencing strategies, we established an exhaustive catalog of single nucleotide polymorphisms (SNPs) and SVs for this panel. Combining this dataset with the transcriptomes of all hybrids, we comprehensively mapped SNPs and SVs associated with gene expression variation. While SVs impact gene expression variation, SNPs exhibit a higher effect size with an overrepresentation of low-frequency variants compared to common ones. These results reinforce the importance of dissecting the heritability of complex traits with a comprehensive catalog of genetic variants at the population level.

  • Article type: Journal Article
    Influenza viruses exhibit considerable diversity between hosts. Additionally, different quasispecies can be found within the same host. High-throughput sequencing technologies can be used to sequence a patient-derived virus population at sufficient depths to identify low-frequency variants (LFV) present in a quasispecies, but many challenges remain for reliable LFV detection because of experimental errors introduced during sample preparation and sequencing. High genomic copy numbers and extensive sequencing depths are required to differentiate false positive from real LFV, especially at low allelic frequencies (AFs). This study proposes a general approach for identifying LFV in patient-derived samples obtained during routine surveillance. Firstly, validated thresholds were determined for LFV detection, whilst balancing both the cost and feasibility of reliable LFV detection in clinical samples. Using a genetically well-defined population of influenza A viruses, thresholds of at least 10⁴ genomes per microlitre and AF of ≥5% were established as detection limits. Secondly, a subset of 59 retained influenza A (H3N2) samples from the 2016-2017 Belgian influenza season was composed. Thirdly, as a proof of concept for the added value of LFV for routine influenza monitoring, potential associations between patient data and whole genome sequencing data were investigated. A significant association was found between a high prevalence of LFV and disease severity. This study provides a general methodology for influenza LFV detection, which can also be adopted by other national influenza reference centres and for other viruses such as SARS-CoV-2. Additionally, this study suggests that the current relevance of LFV for routine influenza surveillance programmes might be undervalued.
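The two detection limits in this abstract can be expressed as a simple gate, shown here as a sketch. It reads the copy-number threshold as 10^4 genomes per microlitre and the allele-frequency threshold as 5%; the function name, the input layout, and the toy values are assumptions, and a real workflow would apply further quality filters that the abstract does not list.

```python
MIN_GENOMES_PER_UL = 1e4  # copy-number threshold, read as 10^4 genomes/µl
MIN_AF = 0.05             # allele-frequency threshold (>= 5%)


def reliable_lfv(genomes_per_ul, variants):
    """Return the variants that pass both thresholds.

    `variants` is a list of (position, allele_frequency) pairs.
    Samples below the copy-number threshold yield no reliable calls at all.
    """
    if genomes_per_ul < MIN_GENOMES_PER_UL:
        return []
    return [(pos, af) for pos, af in variants if af >= MIN_AF]


# Toy positions and frequencies, invented for illustration only.
calls = [(100, 0.02), (250, 0.05), (400, 0.31)]
print(reliable_lfv(2e4, calls))  # sample above the copy-number threshold
print(reliable_lfv(5e3, calls))  # sample below it
```

Gating on the sample before gating on individual variants mirrors the reasoning in the abstract: with too few input genomes, low allele frequencies cannot be distinguished from preparation and sequencing error, whatever the read depth.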

  • Article type: Journal Article
    Re-sequencing of the human genome by next-generation sequencing (NGS) has been widely applied to discover pathogenic genetic variants and/or causative genes accounting for various types of diseases including cancers. The advances in NGS have allowed the sequencing of the entire genome of patients and identification of disease-associated variants in a reasonable timeframe and cost. The core of the variant identification relies on accurate variant calling and annotation. Numerous algorithms have been developed to elucidate the repertoire of somatic and germline variants. Each algorithm has its own distinct strengths, weaknesses, and limitations due to the difference in the statistical modeling approach adopted and read information utilized. Accurate variant calling remains challenging due to the presence of sequencing artifacts and read misalignments. All of these can lead to the discordance of the variant calling results and even misinterpretation of the discovery. For somatic variant detection, multiple factors including chromosomal abnormalities, tumor heterogeneity, tumor-normal cross contaminations, unbalanced tumor/normal sample coverage, and variants with low allele frequencies add even more layers of complexity to accurate variant identification. Given the discordances and difficulties, ensemble approaches have emerged by harmonizing information from different algorithms to improve variant calling performance. In this chapter, we first introduce the general scheme of variant calling algorithms and potential challenges at distinct stages. We next review the existing workflows of variant calling and annotation, and finally explore the strategies deployed by different callers as well as their strengths and caveats. Overall, NGS-based variant identification with careful consideration allows reliable detection of pathogenic variant and candidate variant selection for precision medicine.
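The ensemble idea mentioned in this abstract, combining calls from several algorithms, can be sketched in its simplest form as a support-count vote. This is only one possible harmonization rule and is not taken from the chapter; the caller names, the `min_support` parameter, and the toy variants are hypothetical.

```python
from collections import Counter


def consensus_calls(callsets, min_support=2):
    """Return variants reported by at least `min_support` callers.

    `callsets` maps a caller name to an iterable of variants, each a
    (chrom, pos, ref, alt) tuple. Duplicates within one caller count once.
    """
    support = Counter()
    for calls in callsets.values():
        support.update(set(calls))
    return sorted(v for v, n in support.items() if n >= min_support)


# Toy variants and caller names, invented for illustration only.
v1 = ("chr1", 100, "A", "G")
v2 = ("chr1", 200, "C", "T")
v3 = ("chr2", 50, "G", "A")
callsets = {
    "caller_a": [v1, v2],
    "caller_b": [v2, v3],
    "caller_c": [v2, v1],
}
consensus = consensus_calls(callsets)
print(consensus)
```

A vote like this trades sensitivity for precision: raising `min_support` suppresses caller-specific artifacts but can also discard true low-allele-frequency variants that only the most sensitive caller detects.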

  • Article type: Journal Article
    Multiple Sclerosis (MS) is a complex multifactorial autoimmune disease, whose sex- and age-adjusted prevalence in Sardinia (Italy) is among the highest worldwide. To date, 233 loci were associated with MS and almost 20% of risk heritability is attributable to common genetic variants, but many low-frequency and rare variants remain to be discovered. Here, we aimed to contribute to the understanding of the genetic basis of MS by investigating potentially functional rare variants. To this end, we analyzed thirteen multiplex Sardinian families with Immunochip genotyping data. For five families, Whole Exome Sequencing (WES) data were also available. Firstly, we performed a non-parametric Homozygosity Haplotype analysis for identifying the Region from Common Ancestor (RCA). Then, on these potential disease-linked RCA, we searched for the presence of rare variants shared by the affected individuals by analyzing WES data. We found: (i) a variant (43181034 T > G) in the splicing region on exon 27 of CUL9; (ii) a variant (50245517 A > C) in the splicing region on exon 16 of ATP9A; (iii) a non-synonymous variant (43223539 A > C), on exon 9 of TTBK1; (iv) a non-synonymous variant (42976917 A > C) on exon 9 of PPP2R5D; and (v) a variant (109859349-109859354) in 3'UTR of MYO16.

  • Article type: Journal Article
    Objective: To analyze the association between low-frequency variants of the ARID1A gene and primary liver cancer using a latent class model. Methods: Low-frequency variants of the ARID1A gene were combined according to functional region, and the combined variables were analyzed with the latent class model to obtain a categorical latent variable. Logistic regression was then used to analyze the association between low-frequency variants of the ARID1A gene and primary liver cancer. Results: The latent class model divided the study population into three classes by ARID1A low-frequency variants. Class 1 consisted mainly of individuals without mutations, accounting for 94.2% (2 452/2 603). Class 2 consisted mainly of individuals with mutations in the transcriptional regulatory region, accounting for 4.8% (124/2 603). Class 3 consisted mainly of individuals with exon mutations, accounting for 1.0% (27/2 603). With class 1 as the reference, mutations in the transcriptional regulatory region were associated with a reduced risk of liver cancer (OR=0.601, 95% CI=0.364-0.992, P=0.046). Conclusion: The latent class model can identify low-frequency gene variants associated with liver cancer and can be extended to genetic association studies of low-frequency variants in other complex diseases.
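The reported figures can be checked with a few lines of arithmetic, using class counts of 2 452, 124 and 27, which sum to the stated total of 2 603. The second half of the sketch assumes the confidence interval is a Wald interval on the log-odds scale (the usual output of logistic regression, but not stated in the abstract) and back-calculates an approximate two-sided P value from it; the function name is hypothetical.

```python
import math

# Class sizes consistent with the stated total of 2 603.
class_counts = {"class 1": 2452, "class 2": 124, "class 3": 27}
total = sum(class_counts.values())
percentages = {
    name: round(100 * n / total, 1) for name, n in class_counts.items()
}


def wald_p_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Approximate two-sided P value from an OR and its 95% CI.

    Assumes the interval is symmetric on the log scale (Wald interval).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    return math.erfc(abs(z) / math.sqrt(2))


p_value = wald_p_from_ci(0.601, 0.364, 0.992)
print(total, percentages, p_value)
```

The percentages reproduce the reported 94.2%, 4.8% and 1.0%, and the back-calculated P value lands close to the reported 0.046; small differences are expected because the published OR and interval are rounded.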