structural variants

  • Article type: Journal Article
    ABCA4 is the most frequently mutated gene leading to inherited retinal disease (IRD) with over 2200 pathogenic variants reported to date. Of these, ~1% are copy number variants (CNVs) involving the deletion or duplication of genomic regions, typically >50 nucleotides in length. An in-depth assessment of the current literature based on the public database LOVD, regarding the presence of known CNVs and structural variants in ABCA4, and additional sequencing analysis of ABCA4 using single-molecule Molecular Inversion Probes (smMIPs) for 148 probands highlighted recurrent and novel CNVs associated with ABCA4-associated retinopathies. An analysis of the coverage depth in the sequencing data led to the identification of eleven deletions (six novel and five recurrent), three duplications (one novel and two recurrent) and one complex CNV. Of particular interest was the identification of a complex defect, i.e., a 15.3 kb duplicated segment encompassing exon 31 through intron 41 that was inserted at the junction of a downstream 2.7 kb deletion encompassing intron 44 through intron 47. In addition, we identified a 7.0 kb tandem duplication of intron 1 in three cases. The identification of CNVs in ABCA4 can provide patients and their families with a genetic diagnosis whilst expanding our understanding of the complexity of diseases caused by ABCA4 variants.
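    The coverage-depth analysis mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the normalization by total sample depth, and the ratio thresholds (0.7 and 1.3) are assumptions chosen for illustration only.

```python
from statistics import median


def call_cnv_candidates(sample_depth, reference_depths, del_ratio=0.7, dup_ratio=1.3):
    """Flag targets whose normalized depth deviates from a control panel.

    sample_depth: per-target read depths for one sample.
    reference_depths: list of per-target depth lists from control samples.
    del_ratio / dup_ratio: illustrative thresholds (assumptions, not from the paper).
    Returns a list of (target_index, ratio, call) tuples.
    """
    def normalize(depths):
        # Scale by total depth so samples sequenced to different depths are comparable.
        total = sum(depths)
        return [d / total for d in depths]

    sample_norm = normalize(sample_depth)
    ref_norm = [normalize(r) for r in reference_depths]

    calls = []
    for i, value in enumerate(sample_norm):
        # Median across controls gives a robust per-target baseline.
        baseline = median(r[i] for r in ref_norm)
        if baseline == 0:
            continue  # no usable baseline for this target
        ratio = value / baseline
        if ratio <= del_ratio:
            calls.append((i, ratio, "deletion"))
        elif ratio >= dup_ratio:
            calls.append((i, ratio, "duplication"))
    return calls
```

    A heterozygous deletion is expected to give a ratio near 0.5 and a heterozygous duplication a ratio near 1.5; candidates flagged this way would still need breakpoint confirmation.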

  • Article type: Journal Article
    With the recent expansion of structural variant identification in the human genome, understanding the role of these impactful variants in disease architecture is critically important. Currently, a large proportion of genome-wide-significant genome-wide association study (GWAS) single nucleotide polymorphisms (SNPs) are functionally unresolved, raising the possibility that some of these SNPs are associated with disease through linkage disequilibrium with causal structural variants. Hence, understanding the linkage disequilibrium between newly discovered structural variants and statistically significant SNPs may provide a resource for further investigation into disease-associated regions in the genome. Here we present a resource cataloging structural variant-significant SNP pairs in high linkage disequilibrium. The database is composed of (i) SNPs that have exhibited genome-wide significant association with traits, primarily disease phenotypes, (ii) newly released structural variants (SVs), and (iii) linkage disequilibrium values calculated from unphased data. All data files, including those detailing SV and GWAS SNP associations and results of GWAS-SNP-SV pairs, are available at the SV-SNP LD Database and can be accessed at https://github.com/hliang-SchrodiLab/SV_SNPs. Our analysis results represent a useful fine mapping tool for interrogating SVs in linkage disequilibrium with disease-associated SNPs. We anticipate that this resource may play an important role in subsequent studies which investigate incorporating disease-causing SVs into disease risk prediction models.
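    As a rough illustration of how linkage disequilibrium can be estimated from unphased data, the sketch below uses one common approximation: the squared Pearson correlation between genotype dosages (0, 1 or 2 copies of the alternate allele) at the SV and the SNP. Whether the SV-SNP LD Database used this estimator is not stated in the abstract, so the method and the function name are assumptions.

```python
def genotype_r2(sv_dosage, snp_dosage):
    """Squared Pearson correlation between two unphased genotype dosage vectors.

    Each vector holds one value per individual: 0, 1 or 2 copies of the
    alternate allele. Returns 0.0 when either variant is monomorphic,
    a convention chosen here because r^2 is undefined in that case.
    """
    n = len(sv_dosage)
    if n == 0 or n != len(snp_dosage):
        raise ValueError("dosage vectors must be non-empty and of equal length")

    mean_x = sum(sv_dosage) / n
    mean_y = sum(snp_dosage) / n

    # Sums of squares and cross-products around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(sv_dosage, snp_dosage))
    var_x = sum((x - mean_x) ** 2 for x in sv_dosage)
    var_y = sum((y - mean_y) ** 2 for y in snp_dosage)

    if var_x == 0 or var_y == 0:
        return 0.0
    return (cov * cov) / (var_x * var_y)
```

    Dosage correlation avoids the need for haplotype phasing, at the cost of being an approximation to haplotype-based r² when genotype frequencies depart from Hardy-Weinberg proportions.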

  • Article type: Journal Article
    Structural variants (SVs, including large-scale insertions, deletions, inversions, and translocations) significantly impact the functions of genes in the microbial genome, and SVs in the microbiome are associated with diverse biological processes and human diseases. With the advancements in sequencing and bioinformatics technologies, sequencing data and analysis tools are increasingly used for microbiome SV analyses, leading to a higher demand for more dedicated SV analysis workflows. Moreover, due to the unique detection biases of various sequencing technologies, including short-read sequencing (e.g., Illumina platforms) and long-read sequencing (e.g., Oxford Nanopore and PacBio), SV discovery based on multiple platforms is necessary to comprehensively identify the wide variety of SVs. Here, we establish an integrated pipeline, MetaSVs, combining Nanopore long reads and Illumina short reads to analyze SVs in the microbial genomes from the gut microbiome and further identify differential SVs that can be reflective of metabolic differences. Our pipeline provides researchers easy access to SVs and relevant metabolites in the microbial genomes without the requirement of specific technical expertise, which is particularly useful to researchers interested in metagenomic SVs but lacking sophisticated bioinformatic knowledge.

  • Article type: Journal Article
    Copy-number variants (CNVs) are genome-wide structural variations involving the duplication or deletion of large nucleotide sequences. While these types of variations can be commonly found in humans, large and rare CNVs are known to contribute to the development of various neurodevelopmental disorders (NDDs), including autism spectrum disorder (ASD). Nevertheless, given that these NDD-risk CNVs cover broad regions of the genome, it is particularly challenging to pinpoint the critical gene(s) responsible for the manifestation of the phenotype. In this study, we performed a meta-analysis of CNV data from 11,614 affected individuals with NDDs and 4,031 control individuals from the SFARI database to identify 41 NDD-risk CNV loci, including 24 novel regions. We also found evidence for dosage-sensitive genes within these regions being significantly enriched for known NDD-risk genes and pathways. In addition, a significant proportion of these genes was found to (1) converge in protein-protein interaction networks, (2) be among the most expressed genes in the brain across all developmental stages, and (3) be hit by deletions that are significantly over-transmitted to individuals with ASD within multiplex ASD families from the iHART cohort. Finally, we conducted a burden analysis using 4,281 NDD cases from the Decipher and iHART cohorts, and 2,504 neurotypical control individuals from 1000 Genomes and iHART, which resulted in the validation of the association of 162 dosage-sensitive genes driving risk for NDDs, including 22 novel NDD-risk genes. Importantly, most NDD-risk CNV loci entail multiple NDD-risk genes, in agreement with a polygenic model associated with the majority of NDD cases.

  • Article type: Journal Article
    Whole exome sequencing discovers causative mutations in less than 50 % of rare disease patients, suggesting the presence of additional mutations in the non-coding genome. So far, non-coding mutations have been identified in less than 0.2 % of individuals with genetic diseases listed in the ClinVar database and exhibit highly diverse molecular mechanisms. In contrast to our capability to sequence the whole genome, our ability to discover and functionally confirm such non-coding mutations is lagging behind severely. We discuss the problems and present examples of confirmed mutations in deep intronic sequences, non-coding triplet repeats, enhancers, and larger structural variants and highlight their proposed disease mechanisms. Finally, we discuss the type of data that would be required to establish non-coding mutation detection in routine diagnostics.

  • Article type: Journal Article
    High-throughput sequencing techniques have significantly increased the molecular diagnosis rate for patients with monogenic disorders. This is primarily due to a substantially increased identification rate of disease mutations in the coding sequence, primarily SNVs and indels. Further progress is hampered by difficulties in the detection of structural variants and the interpretation of variants outside the coding sequence. In this review, we provide an overview about how novel sequencing techniques and state-of-the-art algorithms can be used to discover small and structural variants across the whole genome and introduce bioinformatic tools for the prediction of effects variants may have in the non-coding part of the genome.

  • Article type: Journal Article
    Inherited retinal diseases (IRDs) are a group of rare monogenic diseases with high genetic heterogeneity (pathogenic variants identified in over 280 causative genes). The genetic diagnostic rate for IRDs is around 60%, mainly thanks to the routine application of next-generation sequencing (NGS) approaches such as extensive gene panels or whole exome analyses. Whole-genome sequencing (WGS) has been reported to improve this diagnostic rate by revealing elusive variants, such as structural variants (SVs) and deep intronic variants (DIVs). We performed WGS on 33 unsolved cases with suspected autosomal recessive IRD, aiming to identify causative genetic variants in non-coding regions or to detect SVs that were unexplored in the initial screening. Most of the selected cases (30 of 33, 90.9%) carried monoallelic pathogenic variants in genes associated with their clinical presentation, hence we first analyzed the non-coding regions of these candidate genes. Whenever additional pathogenic variants were not identified with this approach, we extended the search for SVs and DIVs to all IRD-associated genes. Overall, we identified the missing causative variants in 11 patients (11 of 33, 33.3%). These included three DIVs in ABCA4, CEP290 and RPGRIP1; one non-canonical splice site (NCSS) variant in PROM1 and three SVs (large deletions) in EYS, PCDH15 and USH2A. For the previously unreported DIV in CEP290 and for the NCSS variant in PROM1, we confirmed the effect on splicing by reverse transcription (RT)-PCR on patient-derived RNA. This study demonstrates the power and clinical utility of WGS as an all-in-one test to identify disease-causing variants missed by standard NGS diagnostic methodologies.

  • Article type: Journal Article
    BACKGROUND: Anopheles funestus is a leading vector of malaria in most parts of East and Southern Africa, yet its ecology and responses to vector control remain poorly understood compared with other vectors such as Anopheles gambiae and Anopheles arabiensis. This study presents the first large-scale survey of the genetic and phenotypic expression of insecticide resistance in An. funestus populations in Tanzania.
    METHODS: We performed insecticide susceptibility bioassays on An. funestus mosquitoes in nine regions with moderate-to-high malaria prevalence in Tanzania, followed by genotyping for resistance-associated mutations (CYP6P9a, CYP6P9b, L119F-GSTe2) and structural variants (SV4.3 kb, SV6.5 kb). Generalized linear models were used to assess relationships between genetic markers and phenotypic resistance. An interactive R Shiny tool was created to visualize the data and support evidence-based interventions.
    RESULTS: Pyrethroid resistance was universal but reversible by piperonyl-butoxide (PBO). However, carbamate resistance was observed in only five of the nine districts, and dichloro-diphenyl-trichloroethane (DDT) resistance was found only in the Kilombero valley, south-eastern Tanzania. Conversely, there was universal susceptibility to the organophosphate pirimiphos-methyl in all sites. Genetic markers of resistance had distinct geographical patterns, with CYP6P9a-R and CYP6P9b-R alleles, and the SV6.5 kb structural variant absent or undetectable in the north-west but prevalent in all other sites, while SV4.3 kb was prevalent in the north-western and western regions but absent elsewhere. Emergent L119F-GSTe2, associated with deltamethrin resistance, was detected in heterozygous form in districts bordering Mozambique, Malawi and the Democratic Republic of Congo. The resistance landscape was most complex in western Tanzania, in Tanganyika district, where all five genetic markers were detected. There was a notable south-to-north spread of resistance genes, especially CYP6P9a-R, though this appears to be interrupted, possibly by the Rift Valley.
    CONCLUSIONS: This study underscores the need to expand resistance monitoring to include An. funestus alongside other vector species, and to screen for both the genetic and phenotypic signatures of resistance. The findings can be visualized online via an interactive user interface and could inform data-driven decision-making for resistance management and vector control. Since this was the first large-scale survey of resistance in Tanzania's An. funestus, we recommend regular updates with greater geographical and temporal coverage.

  • Article type: Journal Article
    While short-read sequencing currently dominates genetic research and diagnostics, it frequently falls short of capturing certain structural variants (SVs), which are often implicated in the etiology of neurodevelopmental disorders (NDDs). Optical genome mapping (OGM) is an innovative technique capable of capturing SVs that are undetectable or challenging to detect via short-read methods. This study aimed to investigate NDDs using OGM, specifically focusing on cases that remained unsolved after standard exome sequencing. OGM was performed in 47 families using ultra-high molecular weight DNA. Single-molecule maps were assembled de novo, followed by SV and copy number variant calling. We identified 7 variants of interest, of which 5 (10.6%) were classified as likely pathogenic or pathogenic, located in BCL11A, OPHN1, PHF8, SON, and NFIA. We also identified an inversion disrupting NAALADL2, a gene which previously was found to harbor complex rearrangements in two NDD cases. Variants in known NDD genes or candidate variants of interest missed by exome sequencing mainly consisted of larger insertions (>1 kbp), inversions, and deletions/duplications of a low number of exons (1-4 exons). In conclusion, in addition to improving molecular diagnosis in NDDs, this technique may also reveal novel NDD genes which may harbor complex SVs often missed by standard sequencing techniques.

  • Article type: Journal Article
    Parkinson's disease (PD) significantly impacts millions of individuals worldwide. Although our understanding of the genetic foundations of PD has advanced, a substantial portion of the genetic variation contributing to disease risk remains unknown. Current PD genetic studies have primarily focused on one form of genetic variation, single nucleotide variants (SNVs), while other important forms of genetic variation, such as structural variants (SVs), are mostly ignored due to the complexity of detecting these variants with traditional sequencing methods. Yet, these forms of genetic variation play crucial roles in gene expression and regulation in the human brain and are causative of numerous neurological disorders, including forms of PD. This review aims to provide a comprehensive overview of our current understanding of the involvement of coding and noncoding SVs in the genetic architecture of PD.