Half-life is an important pharmacokinetic parameter that characterizes the excretion phase of absorption, distribution, metabolism, and excretion (ADME), and it is one of the key factors in bringing a drug candidate to market. Predicting half-life is therefore of great value in drug design. In this study, we used eXtreme Gradient Boosting (XGBoost), random forest (RF), gradient boosting machine (GBM), and support vector machine (SVM) to build quantitative structure-activity relationship (QSAR) models on 3512 compounds. We evaluated model performance with root-mean-square error (RMSE), the coefficient of determination (R²), and mean absolute error (MAE), and interpreted the features with SHapley Additive exPlanations (SHAP). Furthermore, we developed consensus models by integrating the four individual models and validated their performance using a Y-randomization test and applicability domain analysis. Finally, matched molecular pair analysis was used to extract transformation rules. Our results showed that XGBoost outperformed the other individual models (RMSE = 0.176, R² = 0.845, MAE = 0.141). The consensus model integrating all four models further improved prediction performance (RMSE = 0.172, R² = 0.856, MAE = 0.138). With the Y-randomization test and applicability domain analysis, we evaluated the reliability, robustness, and generalization ability of the models. We also used SHAP to interpret the features and matched molecular pair analysis to extract chemical transformation rules that offer suggestions for optimizing drug structures. In conclusion, we believe that the consensus model developed in this study serves as a reliable tool for evaluating half-life in drug discovery, and that the chemical transformation rules derived here can provide valuable guidance for structural optimization.
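To make the evaluation concrete, the following is a minimal sketch of how a consensus prediction and the three reported metrics could be computed. The abstract does not state how the individual models were integrated, so the unweighted average used here is an assumption, and the function names and toy values are hypothetical illustrations rather than data or code from the study.

```python
import math
from statistics import fmean


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(fmean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return fmean(abs(t - p) for t, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def consensus(predictions):
    """Average the per-compound predictions of several models.

    ASSUMPTION: an unweighted arithmetic mean; the study may have
    combined its models differently.
    """
    return [fmean(per_compound) for per_compound in zip(*predictions)]


if __name__ == "__main__":
    # Made-up toy values, NOT results from the study.
    y_true = [0.20, 0.55, 0.90, 1.40]
    model_preds = {
        "XGBoost": [0.30, 0.50, 1.00, 1.30],
        "RF": [0.10, 0.65, 0.80, 1.55],
        "GBM": [0.25, 0.45, 0.95, 1.35],
        "SVM": [0.35, 0.60, 0.75, 1.50],
    }
    model_preds["Consensus"] = consensus(list(model_preds.values()))

    for name, y_pred in model_preds.items():
        print(
            f"{name:10s} RMSE={rmse(y_true, y_pred):.3f} "
            f"R2={r2(y_true, y_pred):.3f} MAE={mae(y_true, y_pred):.3f}"
        )
```

Averaging tends to cancel errors that the individual models make in opposite directions, which is one plausible reason a consensus model can edge out its best member. A Y-randomization check could reuse the same metric functions by refitting the models on shuffled half-life values and confirming that R² drops sharply.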