Keywords: CatBoost; Trypanosoma brucei; carbazole-derived compounds; comprehensive learning particle swarm optimization; quantitative structure–activity relationship; support vector regression

MeSH: Carbazoles / pharmacology; Support Vector Machine; Quantitative Structure-Activity Relationship; Trypanocidal Agents / pharmacology

Source: DOI:10.1128/aac.00265-24

Abstract:
To predict the anti-trypanosome activity of carbazole-derived compounds through quantitative structure-activity relationship (QSAR) modeling, five models were built: a linear model, a random forest, a radial basis kernel function support vector machine, a linear combination mix-kernel function support vector machine, and a nonlinear combination mix-kernel function support vector machine (NLMIX-SVM). The heuristic method and an optimized CatBoost were used to select two different key descriptor sets, for the linear and nonlinear models respectively. Hyperparameters in all nonlinear models were optimized by comprehensive learning particle swarm optimization, which offers low complexity and fast convergence. The models' robustness and reliability were assessed using fivefold and leave-one-out cross-validation, y-randomization, and statistics including the concordance correlation coefficient (CCC), [Formula: see text], [Formula: see text], and [Formula: see text]. Among all the models, the NLMIX-SVM model, built by support vector regression with a new kernel formed as a nonlinear combination of the radial basis, sigmoid, and linear kernel functions, demonstrated excellent learning and generalization abilities as well as robustness: [Formula: see text] = 0.9581 and mean square error (MSE) = 0.0199 for the training set, and [Formula: see text] = 0.9528 and MSE = 0.0174 for the test set. [Formula: see text], [Formula: see text], CCC, [Formula: see text], [Formula: see text], and [Formula: see text] are 0.9539, 0.8908, 0.9752, 0.9529, 0.9528, and 0.9633, respectively. The NLMIX-SVM method proved to be a promising approach for quantitative structure-activity relationship research. In addition, molecular docking experiments were conducted to analyze the properties of new derivatives, and a new potential candidate drug molecule was ultimately identified. In summary, this study can support the design and screening of novel anti-trypanosome drugs.
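The abstract reports a concordance correlation coefficient (CCC) among its validation statistics but does not state the formula. As a minimal sketch, assuming the standard definition by Lin (twice the covariance divided by the sum of both variances plus the squared difference of the means), the statistic could be computed as below. The function name `concordance_correlation` and the toy values are illustrative and are not taken from the paper.

```python
from statistics import fmean


def concordance_correlation(y_true, y_pred):
    """Lin's concordance correlation coefficient for two equal-length sequences.

    Assumption: this is the standard textbook definition; the paper's exact
    formula is not given in the abstract.
    """
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")

    mean_t = fmean(y_true)
    mean_p = fmean(y_pred)

    # Population (1/n) variances and covariance; the 1/n factor cancels
    # against the sum-based form often used in QSAR papers.
    var_t = sum((t - mean_t) ** 2 for t in y_true) / n
    var_p = sum((p - mean_p) ** 2 for p in y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)) / n

    denom = var_t + var_p + (mean_t - mean_p) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined for two identical constant sequences")
    return 2 * cov / denom


if __name__ == "__main__":
    # Hypothetical observed vs. predicted activities (not data from the study).
    observed = [5.1, 5.8, 6.4, 7.0, 7.7]
    predicted = [5.0, 5.9, 6.5, 6.8, 7.9]
    print(round(concordance_correlation(observed, predicted), 4))
```

Unlike the Pearson correlation, CCC penalizes a systematic offset between predictions and observations, so a model whose predictions are perfectly correlated with the measurements but shifted by a constant scores below 1.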