Keywords: 16S rRNA; Escherichia coli; Salmonella; food-borne pathogens; long-read sequencing; serotyping

MeSH: Serogroup; RNA, Ribosomal, 16S / genetics; Phylogeny; Escherichia coli / genetics; Genes, rRNA; Salmonella / genetics; Salmonella enterica / genetics

Source: DOI:10.1128/msystems.00757-23

Abstract:
The resolution of variation within species is critical for interpreting and acting on many microbial measurements. In the key foodborne pathogens Salmonella and Escherichia coli, the primary subspecies classification scheme used is serotyping: differentiating variants within these species by surface antigen profiles. Serotype prediction from whole-genome sequencing (WGS) of isolates is now seen as comparable or preferable to traditional laboratory methods where WGS is available. However, laboratory and WGS methods depend on an isolation step that is time-consuming and incompletely represents the sample when multiple strains are present. Community sequencing approaches that skip the isolation step are, therefore, of interest for pathogen surveillance. Here, we evaluated the viability of amplicon sequencing of the full-length 16S rRNA gene for serotyping Salmonella enterica and E. coli. We developed a novel algorithm for serotype prediction, implemented as an R package (Seroplacer), which takes as input full-length 16S rRNA gene sequences and outputs serovar predictions after phylogenetic placement into a reference phylogeny. We achieved over 89% accuracy in predicting Salmonella serotypes on in silico test data and identified key pathogenic serovars of Salmonella and E. coli in isolate and environmental test samples. Although serotype prediction from 16S rRNA gene sequences is not as accurate as serotype prediction from WGS of isolates, the potential to identify dangerous serovars directly from amplicon sequencing of environmental samples is intriguing for pathogen surveillance. 
The capabilities developed here are also broadly relevant to other applications where intraspecies variation and direct sequencing from environmental samples could be valuable.

Importance:
To prevent and stop outbreaks of foodborne pathogens, we must be able to detect when pathogenic bacteria are present in a food or food-associated site and to identify connections between specific pathogenic bacteria present in different samples. In this work, we develop a new computational technology that allows the important foodborne pathogens Escherichia coli and Salmonella enterica to be serotyped (a subspecies-level classification) from sequencing of a single marker gene: the 16S rRNA gene, which is often used to surveil bacterial communities. Our results suggest current limitations to serotyping from 16S rRNA gene sequencing alone but set the stage for further progress, which we consider likely given the rapid advance of the long-read sequencing technologies and genomic databases our work leverages. If this research direction succeeds, it could enable better detection of foodborne pathogens before they reach the public and speed the resolution of foodborne pathogen outbreaks.
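The abstract describes serovar prediction after a query 16S sequence has been phylogenetically placed into a reference phylogeny. The Python sketch below is not Seroplacer's algorithm, which the abstract does not detail. It is a toy illustration of one plausible final step, assuming the placement tool has already assigned the query to a node of the reference tree. The tree, node names, serovar labels, and the functions `leaves_under` and `predict_serovars` are all hypothetical.

```python
from collections import Counter

# Toy reference phylogeny (hypothetical): internal nodes map to children,
# and each reference leaf carries a known serovar label.
CHILDREN = {
    "root": ["n1", "n2"],
    "n1": ["refA", "refB"],
    "n2": ["refC", "n3"],
    "n3": ["refD", "refE"],
}
SEROVAR = {
    "refA": "Typhimurium",
    "refB": "Typhimurium",
    "refC": "Enteritidis",
    "refD": "Enteritidis",
    "refE": "Dublin",
}


def leaves_under(node):
    """Return all reference leaves descending from `node`."""
    if node in SEROVAR:  # a leaf is its own clade
        return [node]
    found = []
    for child in CHILDREN[node]:
        found.extend(leaves_under(child))
    return found


def predict_serovars(placement_node):
    """Rank serovars by their share of reference leaves in the placement clade.

    Assumption: the query was already placed at `placement_node` by an
    external phylogenetic placement step, which is not shown here.
    """
    labels = [SEROVAR[leaf] for leaf in leaves_under(placement_node)]
    counts = Counter(labels)
    total = len(labels)
    return [(name, n / total) for name, n in counts.most_common()]


if __name__ == "__main__":
    print(predict_serovars("n1"))  # clade containing a single serovar
    print(predict_serovars("n2"))  # mixed clade, ranked by frequency
```

A clade containing a single serovar gives an unambiguous call, while a mixed clade returns several candidates with their shares. This is one way to picture why predictions from 16S alone can be less accurate than predictions from WGS of isolates.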