mutational signatures

  • Article type: Journal Article
    Accurate detection and analysis of somatic variants in cancer involve multiple third-party tools with complex dependencies and configurations, leading to laborious, error-prone, and time-consuming data conversions. This approach lacks accuracy, reproducibility, and portability, limiting clinical application. Musta was developed to address these issues as an end-to-end pipeline for detecting, classifying, and interpreting cancer mutations. Musta is based on a Python command-line tool designed to manage tumor-normal samples for precise somatic mutation analysis. The core is a Snakemake-based workflow that covers all key cancer genomics steps, including variant calling, mutational signature deconvolution, variant annotation, driver gene detection, pathway analysis, and tumor heterogeneity estimation. Musta is easy to install on any system via Docker, with a Makefile handling installation, configuration, and execution, allowing for full or partial pipeline runs. Musta has been validated at the CRS4-NGS Core facility and tested on large datasets from The Cancer Genome Atlas and the Beijing Institute of Genomics. Musta has proven robust and flexible for somatic variant analysis in cancer. It is user-friendly, requiring no specialized programming skills, and enables data processing with a single command line. Its reproducibility ensures consistent results across users following the same protocol.

  • Article type: Journal Article
    The genome of a cell is continuously battered by a plethora of exogenous and endogenous processes that can lead to damaged DNA. Repair mechanisms correct this damage most of the time, but failure to do so leaves mutations. Mutations do not occur in a random manner but typically follow a more or less specific pattern due to known or imputed mutational processes. Mutational signature analysis is the process by which the predominant mutational process can be inferred for a cancer, and it can be used in several contexts to study both the genesis of cancer and its response to therapy. Recent pan-cancer genomic efforts such as "The Cancer Genome Atlas" have identified numerous mutational signatures that can be categorized into single base substitutions, doublet base substitutions, or small insertions/deletions. Understanding these mutational signatures as they occur in non-small cell lung cancer could improve efforts at prevention, predict treatment response to personalized treatments, and guide the development of therapies targeting tumor evolution. For non-small cell lung cancer, several mutational signatures have been identified that correlate with exposures such as tobacco smoking and radon and can also reflect endogenous processes such as aging, APOBEC activity, and loss of mismatch repair. Herein, we provide an overview of the current knowledge of mutational signatures in non-small cell lung cancer.
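The single base substitution (SBS) category mentioned in this abstract is commonly tabulated over 96 channels: six substitution types, reported from the pyrimidine strand, times sixteen combinations of flanking bases. The following is a minimal sketch of that labelling convention, not code from the article; the function name `sbs96_channel` is a hypothetical helper introduced here for illustration.

```python
# Sketch of the conventional 96-channel SBS labelling (illustrative only).
# Assumption: each mutation is given as its 5' neighbor, reference base,
# alternate base, and 3' neighbor, all read on the same strand.

COMPLEMENT = str.maketrans("ACGT", "TGCA")
BASES = {"A", "C", "G", "T"}


def complement(base: str) -> str:
    """Return the complementary base."""
    return base.translate(COMPLEMENT)


def sbs96_channel(five_prime: str, ref: str, alt: str, three_prime: str) -> str:
    """Return a label such as 'T[C>T]A' for one single base substitution."""
    for base in (five_prime, ref, alt, three_prime):
        if base not in BASES:
            raise ValueError(f"unexpected base: {base!r}")
    if ref == alt:
        raise ValueError("reference and alternate bases are identical")
    if ref in "AG":
        # Purine reference: report the event on the opposite strand, so the
        # flanks swap places and every base is complemented.
        five_prime, three_prime = complement(three_prime), complement(five_prime)
        ref, alt = complement(ref), complement(alt)
    return f"{five_prime}[{ref}>{alt}]{three_prime}"


if __name__ == "__main__":
    print(sbs96_channel("T", "C", "T", "A"))  # pyrimidine reference, kept as given
    print(sbs96_channel("T", "G", "A", "A"))  # purine reference, strand-flipped
```

Both calls in the usage example describe the same event seen from opposite strands, so they map to the same channel. Counting mutations per channel gives the 96-element profile that signature tools take as input.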

  • Article type: Journal Article
    Mutational patterns caused by APOBEC3 cytidine deaminase activity are evident throughout human cancer genomes. In particular, the APOBEC3A family member is a potent genotoxin that causes substantial DNA damage in experimental systems and human tumors. However, the mechanisms that ensure genome stability in cells with active APOBEC3A are unknown. Through an unbiased genome-wide screen, we define the Structural Maintenance of Chromosomes 5/6 (SMC5/6) complex as essential for cell viability when APOBEC3A is active. We observe an absence of APOBEC3A mutagenesis in human tumors with SMC5/6 dysfunction, consistent with synthetic lethality. Cancer cells depleted of SMC5/6 incur substantial genome damage from APOBEC3A activity during DNA replication. Further, APOBEC3A activity results in replication tract lengthening which is dependent on PrimPol, consistent with re-initiation of DNA synthesis downstream of APOBEC3A-induced lesions. Loss of SMC5/6 abrogates elongated replication tracts and increases DNA breaks upon APOBEC3A activity. Our findings indicate that replication fork lengthening reflects a DNA damage response to APOBEC3A activity that promotes genome stability in an SMC5/6-dependent manner. Therefore, SMC5/6 presents a potential therapeutic vulnerability in tumors with active APOBEC3A.

  • Article type: Journal Article
    While mutational signatures provide a plethora of prognostic and therapeutic insights, their application to the targeted gene panels used in clinical settings is extremely limited. We develop a mutational representation model (which learns and embeds specific mutation signature connections) that enables prediction of dominant signatures with only a few mutations. We predict the dominant signatures across more than 60,000 tumors with gene panels, delineating their landscape across different cancers. Dominant signature predictions in gene panels are of clinical importance. These include UV, tobacco, and apolipoprotein B mRNA editing enzyme, catalytic polypeptide (APOBEC) signatures that are associated with better survival, independently of mutational burden. Further analyses reveal gene and mutation associations with signatures, such as SBS5 with TP53 and APOBEC with FGFR3 S249C. In a clinical use case, the APOBEC signature is a robust and specific predictor of resistance to epidermal growth factor receptor-tyrosine kinase inhibitors (EGFR-TKIs). Our model provides an easy-to-use way to detect signatures in clinical-setting assays, with many possible clinical implications for an unprecedented number of cancer patients.

  • Article type: Journal Article
    Cancers are the product of evolutionary events, where molecular variation occurs and accumulates in tissues and tumors. Sequencing of this molecular variation informs not only which variants are driving tumorigenesis, but also the mechanisms behind what is fueling mutagenesis. Both of these details are crucial for preventing premature deaths due to cancer, whether it is by targeting the variants driving the cancer phenotype or by measures to prevent exogenous mutations from contributing to somatic evolution. Here, we review tools to determine both molecular signatures and cancer drivers, and avenues by which these metrics may be linked.

  • Article type: Preprint
    The propensities to form lowly populated, short-lived conformations of DNA could vary with sequence, providing an important source of sequence-specificity in biochemical reactions. However, comprehensively measuring how these dynamics vary with sequence is challenging. Using 1H CEST and 13C R1ρ NMR, we measured Watson-Crick to Hoogsteen dynamics for an A-T base pair in thirteen trinucleotide sequence contexts. The Hoogsteen population and exchange rate varied 4-fold and 16-fold, respectively, and were dependent on both the 3'- and 5'-neighbors but only weakly dependent on monovalent ion concentration (25 versus 100 mM NaCl) and pH (6.8 versus 8.0). Flexible TA and CA dinucleotide steps exhibited the highest Hoogsteen populations, and their kinetic rates strongly depended on the 3'-neighbor. In contrast, the stiffer AA and GA steps had the lowest Hoogsteen population, and their kinetics were weakly dependent on the 3'-neighbor. The Hoogsteen lifetime was especially short when G-C neighbors flanked the A-T base pair. The Hoogsteen dynamics had a distinct sequence-dependence compared to duplex stability and minor groove width. Thus, our results uncover a unique source of sequence-specificity hidden within the DNA double helix in the form of A-T Hoogsteen dynamics and establish the utility of 1H CEST to quantitatively measure sequence-dependent DNA dynamics.

  • Article type: Journal Article
    Multiple myeloma (MM) is the second most common hematological malignancy, which remains incurable despite recent advances in treatment strategies. Like other forms of cancer, MM is characterized by genomic instability, caused by defects in DNA repair. Along with mutations in DNA repair genes and genotoxic drugs used to treat MM, non-canonical secondary DNA structures (four-stranded G-quadruplex structures) can affect accumulation of somatic mutations and chromosomal abnormalities in the tumor cells of MM patients. Here, we tested the hypothesis that G-quadruplex structures may influence the distribution of somatic mutations in the tumor cells of MM patients. We sequenced exomes of normal and tumor cells of 11 MM patients and analyzed the data for the presence of G4 context around points of somatic mutations. To identify molecular mechanisms that could affect mutational profile of tumors, we also analyzed mutational signatures in tumor cells as well as germline mutations for the presence of specific SNPs in DNA repair genes or in genes regulating G-quadruplex unwinding. In several patients, we found that sites of somatic mutations are frequently located in regions with G4 context. This pattern correlated with specific germline variants found in these patients. We discuss the possible implications of these variants for mutation accumulation and specificity in MM and propose that the extent of G4 context enrichment around somatic mutation sites may be a novel metric characterizing mutational processes in tumors.
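The abstract does not say how "G4 context" around a mutation was determined. As one possible reading, the sketch below flags a mutation when a canonical G-quadruplex motif (four runs of at least three guanines separated by loops of 1 to 7 nucleotides, a widely used sequence pattern) occurs on either strand within a fixed window around the site. The function name `has_g4_context`, the 50-nucleotide flank, and the motif definition are all assumptions made here for illustration, not the authors' method.

```python
import re

# Assumed canonical G-quadruplex motif: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+.
G4_FORWARD = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")
# The same motif on the opposite strand appears as runs of C on this strand.
G4_REVERSE = re.compile(r"C{3,}[ACGT]{1,7}C{3,}[ACGT]{1,7}C{3,}[ACGT]{1,7}C{3,}")


def has_g4_context(sequence: str, position: int, flank: int = 50) -> bool:
    """Return True if a G4 motif lies within `flank` nt of a 0-based position.

    `flank` is a hypothetical window size chosen for this sketch.
    """
    if not 0 <= position < len(sequence):
        raise ValueError("position is outside the sequence")
    start = max(0, position - flank)
    window = sequence[start:position + flank + 1].upper()
    return bool(G4_FORWARD.search(window) or G4_REVERSE.search(window))


if __name__ == "__main__":
    # Toy sequence with one motif embedded in an AT-rich background.
    toy = "AT" * 20 + "GGGAGGGTGGGAGGG" + "AT" * 20
    print(has_g4_context(toy, 30))           # motif lies inside the window
    print(has_g4_context(toy, 30, flank=5))  # window too small to reach it
```

Under these assumptions, the fraction of somatic mutation sites for which the function returns True, compared with the same fraction at matched control positions, would give a simple enrichment measure of the kind the abstract proposes.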

  • Article type: Journal Article
    Early-onset breast cancer (EoBC), defined by a diagnosis <40 years of age, is associated with poor prognosis. This study investigated the mutational landscape of non-metastatic EoBC and the prognostic relevance of mutational signatures using 100 tumour samples from Alberta, Canada. The MutationalPatterns package in R/Bioconductor was used to extract de novo single-base substitution (SBS) and insertion-deletion (indel) mutational signatures and to fit COSMIC SBS and indel signatures. We assessed associations between these signatures and clinical characteristics of disease, in addition to recurrence-free (RFS) and overall survival (OS). Five SBS and two indel signatures were extracted. The SBS13-like signature had higher relative contributions in the HER2-enriched subtype. Patients with higher than median contribution tended to have better RFS after adjustment for other prognostic factors (HR = 0.29; 95% CI: 0.08-1.06). An unsupervised clustering algorithm based on absolute contribution revealed three clusters of fitted COSMIC SBS signatures, but cluster membership was not associated with clinical variables or survival outcomes. The results of this exploratory study reveal various SBS and indel signatures may be associated with clinical features of disease and prognosis. Future studies with larger samples are required to better understand the mechanistic underpinnings of disease progression and treatment response in EoBC.
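Fitting known signatures to a tumour's mutation counts, as described in this abstract, amounts to finding non-negative exposures whose weighted sum of signature profiles best reproduces the observed counts. The sketch below illustrates that idea in plain Python using multiplicative updates for non-negative least squares. It is not the MutationalPatterns implementation, and the two four-channel "signatures" are made-up toy profiles, not COSMIC signatures; `fit_exposures` is a hypothetical helper name.

```python
# Sketch: non-negative fit of signature exposures (illustrative only).
# Minimizes the squared error between counts and sum_k exposure[k] * signature[k]
# with multiplicative updates, which keep every exposure non-negative.

def fit_exposures(signatures, counts, iterations=500, eps=1e-12):
    """Return absolute exposures, one per signature.

    signatures: list of profiles, each a list of non-negative channel weights
    counts:     observed mutation counts per channel
    """
    n_sig, n_chan = len(signatures), len(counts)
    exposures = [sum(counts) / n_sig] * n_sig  # uniform starting point
    for _ in range(iterations):
        # Counts predicted by the current exposures.
        recon = [
            sum(exposures[k] * signatures[k][c] for k in range(n_sig))
            for c in range(n_chan)
        ]
        updated = []
        for k in range(n_sig):
            numer = sum(signatures[k][c] * counts[c] for c in range(n_chan))
            denom = sum(signatures[k][c] * recon[c] for c in range(n_chan))
            updated.append(exposures[k] * numer / (denom + eps))
        exposures = updated
    return exposures


def relative_contributions(exposures):
    """Scale absolute exposures so that they sum to 1."""
    total = sum(exposures)
    return [e / total for e in exposures] if total > 0 else list(exposures)


if __name__ == "__main__":
    toy_signatures = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]]  # hypothetical
    toy_counts = [28, 10, 10, 52]  # built as 30 * first + 70 * second
    fitted = fit_exposures(toy_signatures, toy_counts)
    print([round(e, 3) for e in fitted])
    print([round(r, 3) for r in relative_contributions(fitted)])
```

The abstract's distinction between absolute and relative contribution corresponds to the raw exposures versus their normalized values: the first scales with mutation burden, while the second describes the mix of processes.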

  • Article type: Journal Article
    Tumor mutational signatures have gained prominence in cancer research, yet the lack of standardized methods hinders reproducibility and robustness. Leveraging colorectal cancer (CRC) as a model, we explored the influence of computational parameters on mutational signature analyses across 230 CRC cell lines and 152 CRC patients. Results were validated in three independent datasets: 483 endometrial cancer patients stratified by mismatch repair (MMR) status, 35 lung cancer patients by smoking status and 12 patient-derived organoids (PDOs) annotated for colibactin exposure. Assessing various bioinformatic tools, reference datasets and input data sizes including whole genome sequencing, whole exome sequencing and a pan-cancer gene panel, we demonstrated significant variability in the results. We report that the use of distinct algorithms and references led to statistically different results, highlighting how arbitrary choices may induce variability in the mutational signature contributions. Furthermore, we found a differential contribution of mutational signatures between coding and intergenic regions and defined the minimum number of somatic variants required for reliable mutational signature assignment. To facilitate the identification of the most suitable workflows, we developed Comparative Mutational Signature analysis on Coding and Extragenic Regions (CoMSCER), a bioinformatic tool which allows researchers to easily perform comparative mutational signature analysis by coupling the results from several tools and public reference datasets and to assess mutational signature contributions in coding and non-coding genomic regions. In conclusion, our study provides a comparative framework to elucidate the impact of distinct computational workflows on mutational signatures.

  • Article type: Journal Article
    Around 80 to 85% of all lung cancers are non-small cell lung cancer (NSCLC). Previous research has explored the genetic basis of NSCLC through individual approaches, but studies have yet to investigate the results of combining them. Here we show that analyzing NSCLC genetics through three approaches simultaneously yields unique insights into the disease. Through a combination of previous research and bioinformatics tools, we determined 35 NSCLC candidate genes. We analyzed these genes with three different approaches. First, we found the gene fusions between these candidate genes. Second, we found the common superfamilies between genes. Finally, we identified mutational signatures that are possibly associated with NSCLC. Each approach yields its own distinct results. Fusion relationships identify specific gene fusion targets, common superfamilies identify possible avenues to determine novel target genes, and identifying NSCLC-associated mutational signatures has diagnostic and prognostic benefits. Combining the approaches, we found that the gene CD74 has significant fusion relationships but no association with the other two approaches, suggesting that CD74 is associated with NSCLC mainly because of its fusion relationships. Targeting the gene fusions of CD74 may be an alternative NSCLC treatment. This genetic analysis has indeed created unique insight into NSCLC genes. The results of each approach, both separately and in combination, allow pursuit of more effective treatment strategies for this cancer. The methodology presented can also apply to other cancers, creating insights that current analytical methods could not find.