Keywords: LpxC; QSAR; activity cliff; antimicrobial resistance; cheminformatics; chemotype; machine learning

Source: DOI:10.17179/excli2023-6356

Abstract:
Antimicrobial resistance (AMR) has emerged as one of the global threats to human health in the 21st century. Discovering inhibitors of novel targets, rather than of conventional bacterial targets, is considered an unavoidable strategy for addressing the growing threat of AMR infections. In this study, we applied quantitative structure-activity relationship (QSAR) modeling to LpxC inhibitors to predict their inhibitory activity. In addition, we performed several cheminformatics analyses: exploration of the chemical space, identification of chemotypes, analysis of the structure-activity landscape and activity cliffs, and construction of a Structure-Activity Similarity (SAS) map. We built a total of 24 QSAR classification models using PubChem and MACCS fingerprints with 12 different machine learning algorithms. The best model with the PubChem fingerprint was the Extreme Gradient Boosting model (accuracy on the training set: 0.937; accuracy on the 10-fold cross-validation set: 0.795; accuracy on the test set: 0.799). The best model with the MACCS fingerprint was the Random Forest model (accuracy on the training set: 0.955; accuracy on the 10-fold cross-validation set: 0.803; accuracy on the test set: 0.785). We also identified eight consensus activity cliff generators that are highly informative for further SAR investigations. We hope that the findings presented here can guide further lead optimization of LpxC inhibitors.
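
To make the SAS-map and activity-cliff idea concrete, here is a minimal Python sketch. It is not the authors' pipeline. The abstract does not state the similarity measure, the activity scale, or the cutoffs, so everything here is an assumption made for illustration: Tanimoto similarity, fingerprints represented as sets of on-bit indices, activities on a pIC50-like scale, the cutoffs `sim_cutoff=0.8` and `act_cutoff=2.0`, and the toy compounds themselves. Each compound pair becomes one point on the SAS map (similarity against activity difference), and pairs that are both highly similar and far apart in activity are flagged as activity cliffs.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def sas_pairs(compounds):
    """One SAS-map point per compound pair: (id_i, id_j, similarity, |activity difference|)."""
    points = []
    for (id_i, fp_i, act_i), (id_j, fp_j, act_j) in combinations(compounds, 2):
        points.append((id_i, id_j, tanimoto(fp_i, fp_j), abs(act_i - act_j)))
    return points


def activity_cliffs(points, sim_cutoff=0.8, act_cutoff=2.0):
    """Keep pairs that are structurally similar but differ strongly in activity.

    Both cutoffs are illustrative assumptions, not values from the study.
    """
    return [p for p in points if p[2] >= sim_cutoff and p[3] >= act_cutoff]


# Hypothetical toy data: (compound id, on-bit indices, pIC50-like activity).
compounds = [
    ("cpd_A", {1, 2, 3, 4, 5}, 8.5),
    ("cpd_B", {1, 2, 3, 4, 5, 6}, 5.9),
    ("cpd_C", {7, 8, 9}, 6.0),
]

cliffs = activity_cliffs(sas_pairs(compounds))
for id_i, id_j, sim, diff in cliffs:
    print(f"{id_i} / {id_j}: similarity={sim:.2f}, activity difference={diff:.2f}")
```

In this toy set only the pair `cpd_A` / `cpd_B` passes both cutoffs. A compound that appears in many such pairs would be a candidate activity cliff generator in the sense used above.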