Generalizability of machine learning in predicting antimicrobial resistance in E. coli: a multi-country case study in Africa.

Abstract:

BACKGROUND: Antimicrobial resistance (AMR) remains a significant global health threat, particularly in low- and middle-income countries (LMICs). These regions often grapple with limited healthcare resources and limited access to advanced diagnostic tools. Consequently, there is a pressing need for innovative approaches that can enhance AMR surveillance and management. Machine learning (ML), though underutilized in these settings, presents a promising avenue. This study leverages ML models trained on whole-genome sequencing data from England, where such data are more readily available, to predict AMR in E. coli, targeting the key antibiotics ciprofloxacin, ampicillin, and cefotaxime. A crucial part of our work involved validating these models on an independent dataset from Africa, specifically from Uganda, Nigeria, and Tanzania, to ascertain their applicability and effectiveness in LMICs.
RESULTS: Model performance varied across antibiotics. The Support Vector Machine excelled in predicting ciprofloxacin resistance (87% accuracy, F1 Score: 0.57), the Light Gradient Boosting Machine in predicting cefotaxime resistance (92% accuracy, F1 Score: 0.42), and Gradient Boosting in predicting ampicillin resistance (58% accuracy, F1 Score: 0.66). In validation with data from Africa, Logistic Regression showed high accuracy for ampicillin (94%, F1 Score: 0.97), while Random Forest and the Light Gradient Boosting Machine were effective for ciprofloxacin (50% accuracy, F1 Score: 0.56) and cefotaxime (45% accuracy, F1 Score: 0.54), respectively. Key mutations associated with AMR were identified for these antibiotics.
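The results above report accuracy and F1 Score side by side, and the two can diverge sharply when resistant and susceptible isolates are unevenly represented. The minimal sketch below shows how both metrics follow from a confusion matrix. The function name `accuracy_and_f1` and all counts are hypothetical, chosen only to illustrate the pattern; they are not taken from the study, and treating "resistant" as the positive class is an assumption.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, F1) with the resistant class treated as positive.

    tp/fp/fn/tn are confusion-matrix counts. This is an illustrative helper,
    not the study's evaluation code.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1


# Hypothetical cohort A: 100 isolates, only 10 resistant (rare positive class).
acc_a, f1_a = accuracy_and_f1(tp=3, fp=1, fn=7, tn=89)
print(f"rare resistance:   accuracy={acc_a:.2f}, F1={f1_a:.2f}")

# Hypothetical cohort B: 100 isolates, 90 resistant (common positive class).
acc_b, f1_b = accuracy_and_f1(tp=88, fp=4, fn=2, tn=6)
print(f"common resistance: accuracy={acc_b:.2f}, F1={f1_b:.2f}")
```

In cohort A most isolates are susceptible, so a model can reach high accuracy while missing most resistant isolates, which pulls F1 down. In cohort B resistance dominates, so accuracy and F1 are both high. This is one possible reading of why the accuracy and F1 figures above do not move together, not a confirmed explanation.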
CONCLUSIONS: As the threat of AMR continues to rise, the successful application of these models, particularly on genomic datasets from LMICs, signals a promising avenue for improving AMR prediction to support large AMR surveillance programs. This work thus not only expands our current understanding of the genetic underpinnings of AMR but also provides a robust methodological framework that can guide future research and applications in the fight against AMR.
