Keywords: SARS-CoV-2 variant calling

MeSH: Humans; SARS-CoV-2 / genetics; COVID-19 / epidemiology; Pandemics; Workflow; Computational Biology

Source: DOI:10.3390/v16030430

Abstract:
Genomic sequencing of clinical samples to identify emerging variants of SARS-CoV-2 has been a key public health tool for curbing the spread of the virus. As a result, an unprecedented number of SARS-CoV-2 genomes were sequenced during the COVID-19 pandemic, which allowed for rapid identification of genetic variants, enabling the timely design and testing of therapies and deployment of new vaccine formulations to combat the new variants. However, despite the technological advances of deep sequencing, the analysis of the raw sequence data generated globally is neither standardized nor consistent, leading to vastly disparate sequences that may impact identification of variants. Here, we show that for both Illumina and Oxford Nanopore sequencing platforms, downstream bioinformatic protocols used by industry, government, and academic groups resulted in different virus sequences from the same sample. These bioinformatic workflows produced consensus genomes that differed in single nucleotide polymorphisms and in the inclusion or exclusion of insertions and/or deletions, despite using the same raw sequence data as input. We compared and characterized these discrepancies and propose a specific suite of parameters and protocols that should be adopted across the field. Consistent results from bioinformatic workflows are fundamental to SARS-CoV-2 and future pathogen surveillance efforts, including pandemic preparation, to allow for a data-driven and timely public health response.