BACKGROUND: Salmonella enterica is among the major public health burdens worldwide. Typing of salmonellae below the species level is fundamental for different purposes, but traditional methods are expensive, technically demanding, and time-consuming, and are therefore limited to reference centers. Fourier transform infrared (FTIR) spectroscopy is an alternative method for bacterial typing that has been successfully applied for classification at different infra-species levels.
OBJECTIVE: This study aimed to address the challenge of subtyping Salmonella enterica at the O-serogroup level by using FTIR spectroscopy. We applied machine learning to develop a novel approach for S. enterica typing, using the FTIR-based IR Biotyper® system (IRBT; Bruker Daltonics GmbH & Co. KG, Germany). We investigated a multicenter collection of isolates and compared the novel approach with classical serotyping-based and molecular methods.
METHODS: A total of 958 well-characterized Salmonella isolates (25 serogroups, 138 serovars), collected in 11 different centers (in Europe and Japan) from clinical, environmental, and food samples, were included in this study and analyzed by IRBT. Infrared absorption spectra were acquired from water-ethanol bacterial suspensions prepared from culture isolates grown on seven different agar media. In the first part of the study, the discriminatory potential of the IRBT system was evaluated by comparison with reference typing methods. In the second part of the study, the artificial intelligence capabilities of the IRBT software were applied to develop a classifier for Salmonella isolates at the serogroup level. Different machine learning algorithms were investigated (artificial neural networks and support vector machines). A subset of 88 pre-characterized isolates (corresponding to 25 serogroups and 53 serovars) was included in the training set. The remaining 870 samples were used as the validation set. The classifiers were evaluated in terms of accuracy, error rate, and failed classification rate.
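The three evaluation metrics named above can be illustrated with a minimal Python sketch. The abstract does not give their formulas, so the definitions here are assumptions: accuracy is the share of isolates assigned the correct serogroup, error rate is the share assigned a wrong serogroup, and failed classification rate is the share for which the classifier returned no result (represented here as `None`). The function name `evaluate_classifier` and the example labels are hypothetical and are not taken from the IRBT software or from the study data.

```python
from typing import Dict, Optional, Sequence


def evaluate_classifier(
    true_labels: Sequence[str],
    predicted_labels: Sequence[Optional[str]],
) -> Dict[str, float]:
    """Return accuracy, error rate, and failed classification rate.

    Assumed definitions (not stated in the abstract):
      accuracy    = correct predictions / all isolates
      error_rate  = wrong predictions / all isolates
      failed_rate = isolates with no prediction (None) / all isolates
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    total = len(true_labels)
    if total == 0:
        raise ValueError("at least one isolate is required")

    # An isolate with no prediction counts as a failed classification.
    failed = sum(1 for p in predicted_labels if p is None)
    correct = sum(
        1 for t, p in zip(true_labels, predicted_labels)
        if p is not None and p == t
    )
    # Everything that was classified but not correctly is an error.
    errors = total - correct - failed

    return {
        "accuracy": correct / total,
        "error_rate": errors / total,
        "failed_rate": failed / total,
    }


# Illustrative labels only; these are not results from the study.
reference = ["O:4", "O:4", "O:9", "O:7", "O:8"]
predicted = ["O:4", "O:4", "O:9", "O:8", None]
metrics = evaluate_classifier(reference, predicted)
print(metrics)
```

Under these assumed definitions the three rates sum to 1, so a classifier that declines to classify ambiguous spectra trades a lower error rate for a higher failed classification rate.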
RESULTS: The classifier that provided the highest accuracy in cross-validation was selected for testing with four external test sets. Across all testing sites, accuracy ranged from 97.0% to 99.2% for non-selective media and from 94.7% to 96.4% for selective media.
CONCLUSIONS: The IRBT system proved to be a very promising, user-friendly, and cost-effective tool for Salmonella typing at the serogroup level. The application of machine learning algorithms enabled a novel approach for typing that relies on automated analysis and result interpretation and is therefore free of potential human biases. The system demonstrated high robustness and adaptability to routine workflows, without the need for highly trained personnel, and proved suitable for isolates grown on different agar media, both selective and non-selective. Further tests with currently circulating clinical, food, and environmental isolates would be necessary before implementing it as a potentially stand-alone standard method for routine use.