Keywords: Lineage assignment; Next generation sequencing; SARS-CoV-2; Variant calling; Workflow

MeSH: SARS-CoV-2 / genetics; Workflow; Humans; COVID-19 / virology, epidemiology; Genome, Viral; Software; Reproducibility of Results

Source: DOI:10.1186/s12864-024-10539-0

Abstract:
BACKGROUND: At a global scale, SARS-CoV-2 did not retain its initial genotype for long: the first global reports of variants of concern (VOCs) appeared in late 2020. Since then, genome sequencing has become an indispensable tool for characterizing the ongoing pandemic, particularly for typing SARS-CoV-2 samples obtained from patients or environmental surveillance. Various in vitro and in silico workflows exist for such typing, yet to date no systematic cross-platform validation has been reported.
RESULTS: In this work, we present the first comprehensive cross-platform evaluation and validation of in silico SARS-CoV-2 typing workflows. The evaluation relies on a dataset of 54 patient-derived samples sequenced with several different in vitro approaches on all relevant state-of-the-art sequencing platforms. Moreover, we present UnCoVar, a robust, production-grade reproducible SARS-CoV-2 typing workflow that outperforms all other tested approaches in terms of precision and recall.
CONCLUSIONS: In many ways, the SARS-CoV-2 pandemic has accelerated the development of techniques and analytical approaches. We believe that this can serve as a blueprint for dealing with future pandemics. Accordingly, UnCoVar is easily generalizable towards other viral pathogens and future pandemics. The fully automated workflow assembles virus genomes from patient samples, identifies existing lineages, and provides high-resolution insights into individual mutations. UnCoVar includes extensive quality control and automatically generates interactive visual reports. UnCoVar is implemented as a Snakemake workflow. The open-source code is available under a BSD 2-clause license at github.com/IKIM-Essen/uncovar.