Keywords: SARS-CoV-2; benchmark; environmental sequencing; surveillance; wastewater

MeSH: Wastewater / virology; SARS-CoV-2 / genetics, classification; COVID-19 / virology, epidemiology; Humans; Genome, Viral; Computational Biology / methods; Genomics / methods; Wastewater-Based Epidemiological Monitoring; Phylogeny

Source: DOI:10.1099/mgen.0.001249

Abstract:
Wastewater-based surveillance (WBS) is an important epidemiological and public health tool for tracking pathogens across the scale of a building, neighbourhood, city, or region. WBS gained widespread adoption globally during the SARS-CoV-2 pandemic for estimating community infection levels by qPCR. Sequencing pathogen genes or genomes from wastewater adds information about pathogen genetic diversity, which can be used to identify viral lineages (including variants of concern) that are circulating in a local population. Capturing the genetic diversity by WBS sequencing is not trivial, as wastewater samples often contain a diverse mixture of viral lineages with real mutations and sequencing errors, which must be deconvoluted computationally from short sequencing reads. In this study we assess nine different computational tools that have recently been developed to address this challenge. We simulated 100 wastewater sequence samples consisting of SARS-CoV-2 BA.1, BA.2, and Delta lineages, in various mixtures, as well as a Delta-Omicron recombinant and a synthetic 'novel' lineage. Most tools performed well in identifying the true lineages present and estimating their relative abundances and were generally robust to variation in sequencing depth and read length. While many tools identified lineages present down to 1 % frequency, results were more reliable above a 5 % threshold. The presence of an unknown synthetic lineage, which represents an unclassified SARS-CoV-2 lineage, increased the error in relative abundance estimates of other lineages, but the magnitude of this effect was small for most tools. The tools also varied in how they labelled novel synthetic lineages and recombinants. While our simulated dataset represents just one of many possible use cases for these methods, we hope it helps users understand potential sources of error or bias in wastewater sequencing analysis and appreciate the commonalities and differences across methods.
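To make the deconvolution step concrete, the following is a minimal sketch of one common way to frame it: treat the observed mutation frequencies as a non-negative mixture of lineage signatures and solve for the mixing weights. It is not the method of any of the nine tools assessed here. The signature matrix, the function name `estimate_abundances`, and the multiplicative-update solver are all assumptions chosen for illustration, and the mutation patterns are invented rather than real lineage definitions.

```python
# Toy signature matrix: rows are mutation sites, columns are lineages.
# 1 means the lineage carries the mutation. These patterns are invented
# for illustration and are NOT real SARS-CoV-2 lineage definitions.
LINEAGES = ["BA.1", "BA.2", "Delta"]
SIGNATURES = [
    [1, 1, 0],  # hypothetical site shared by BA.1 and BA.2
    [1, 0, 0],  # hypothetical site unique to BA.1
    [0, 1, 0],  # hypothetical site unique to BA.2
    [0, 0, 1],  # hypothetical site unique to Delta
    [0, 1, 1],  # hypothetical site shared by BA.2 and Delta
    [1, 0, 1],  # hypothetical site shared by BA.1 and Delta
]


def estimate_abundances(signatures, observed, iterations=2000, eps=1e-12):
    """Estimate relative lineage abundances from per-site mutation frequencies.

    Solves a non-negative least-squares problem (signatures @ x ~= observed)
    with multiplicative updates, then normalises x to sum to 1.
    """
    n_sites = len(signatures)
    n_lineages = len(signatures[0])
    x = [1.0 / n_lineages] * n_lineages  # start from a uniform mixture

    # A^T b is constant across iterations, so compute it once.
    atb = [
        sum(signatures[i][j] * observed[i] for i in range(n_sites))
        for j in range(n_lineages)
    ]
    for _ in range(iterations):
        # Predicted frequency at each site under the current mixture (A x).
        predicted = [
            sum(signatures[i][j] * x[j] for j in range(n_lineages))
            for i in range(n_sites)
        ]
        # A^T (A x), the denominator of the multiplicative update.
        atax = [
            sum(signatures[i][j] * predicted[i] for i in range(n_sites))
            for j in range(n_lineages)
        ]
        # The update keeps every weight non-negative by construction.
        x = [x[j] * atb[j] / (atax[j] + eps) for j in range(n_lineages)]

    total = sum(x)
    return [v / total for v in x] if total > 0 else x


if __name__ == "__main__":
    # Simulate noise-free frequencies from a known mixture, then recover it.
    true_mix = [0.6, 0.3, 0.1]
    observed = [sum(a * t for a, t in zip(row, true_mix)) for row in SIGNATURES]
    for name, value in zip(LINEAGES, estimate_abundances(SIGNATURES, observed)):
        print(f"{name}: {value:.3f}")
```

In this noise-free toy case the estimate recovers the simulated mixture closely. Real samples add sequencing error, uneven coverage, and lineages missing from the signature matrix, which is the kind of error source the benchmark examines.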