Keywords: Raman spectroscopy; convolutional neural network; long short-term memory neural network; machine learning; surface enhanced Raman spectroscopy

Source: DOI:10.3389/fmicb.2021.696921

Abstract:
Raman spectroscopy (RS) is a widely used analytical technique that detects molecular vibrations in a defined system and produces Raman spectra containing unique, highly resolved fingerprints of that system. However, the low intensity of normal Raman scattering greatly hinders its application. Surface enhanced Raman spectroscopy (SERS), a recently emerged technique, overcomes this problem by mixing samples with metal nanoparticles such as gold and silver, which enhances the Raman signal by orders of magnitude compared with regular RS. In clinical and research laboratories, SERS combined with appropriate machine learning (ML) algorithms offers great potential for fast, sensitive, label-free, and non-destructive microbial detection and identification. However, choosing an appropriate algorithm for a specific group of bacterial species remains challenging, because not all algorithms achieve high accuracy on the large volumes of data that SERS analysis generates. In this study, we compared three unsupervised and 10 supervised machine learning methods on 2,752 SERS spectra from 117 Staphylococcus strains belonging to nine clinically important Staphylococcus species, in order to test the capacity of different methods for rapid bacterial differentiation and accurate prediction. Density-based spatial clustering of applications with noise (DBSCAN) showed the best clustering capacity (Rand index 0.9733), while the convolutional neural network (CNN) outperformed all other supervised machine learning methods as the best model for predicting Staphylococcus species from SERS spectra (ACC 98.21%, AUC 99.93%).
Taken together, this study shows that machine learning methods can distinguish closely related Staphylococcus species and therefore have great potential for diagnosing bacterial pathogens in clinical settings.
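
For readers unfamiliar with the clustering score quoted above, the following is a minimal sketch of the plain (unadjusted) Rand index: the fraction of sample pairs on which two labelings agree, either by grouping the pair together in both or by separating it in both. The abstract does not state which implementation or variant the authors used, so the function name `rand_index` and the toy labels are illustrative assumptions, not the study's code or data.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Return the fraction of sample pairs on which two labelings agree."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        raise ValueError("need at least two samples")
    agree = 0
    for i, j in pairs:
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        # A pair counts as agreement when both labelings put it together
        # or both labelings keep it apart.
        if same_true == same_pred:
            agree += 1
    return agree / len(pairs)


# Hypothetical toy example: six spectra, three species, three clusters.
species = ["aureus", "aureus", "aureus", "epidermidis", "epidermidis", "hominis"]
clusters = [0, 0, 1, 1, 1, 2]
print(rand_index(species, clusters))
```

A value of 1.0 means the clustering reproduces the reference grouping exactly. The score depends only on how samples are grouped, not on the cluster names, so relabeling the clusters leaves it unchanged.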