Keywords: cancer; machine learning; mutational patterns; mutational signatures; precision medicine; somatic mutations; somatic variant detection

MeSH: Humans; Neoplasms / genetics, diagnosis; Mutation; Software; Genomics / methods; Reproducibility of Results; High-Throughput Nucleotide Sequencing / methods; Computational Biology / methods; DNA Mutational Analysis / methods, economics

Source: DOI:10.3390/ijms25158044

Abstract:
Accurate detection and analysis of somatic variants in cancer involve multiple third-party tools with complex dependencies and configurations, leading to laborious, error-prone, and time-consuming data conversions. This approach lacks accuracy, reproducibility, and portability, limiting clinical application. Musta was developed to address these issues as an end-to-end pipeline for detecting, classifying, and interpreting cancer mutations. Musta is based on a Python command-line tool designed to manage tumor-normal samples for precise somatic mutation analysis. The core is a Snakemake-based workflow that covers all key cancer genomics steps, including variant calling, mutational signature deconvolution, variant annotation, driver gene detection, pathway analysis, and tumor heterogeneity estimation. Musta is easy to install on any system via Docker, with a Makefile handling installation, configuration, and execution, allowing for full or partial pipeline runs. Musta has been validated at the CRS4-NGS Core facility and tested on large datasets from The Cancer Genome Atlas and the Beijing Institute of Genomics. Musta has proven robust and flexible for somatic variant analysis in cancer. It is user-friendly, requiring no specialized programming skills, and enables data processing with a single command line. Its reproducibility ensures consistent results across users following the same protocol.