Consensus modelling

  • Article type: Journal Article
    All areas of modern society are affected by fluorine chemistry. In particular, fluorine plays an important role in the medical, pharmaceutical and agrochemical sciences. Amongst the various fluoro-organic compounds, those bearing the trifluoromethyl (CF3) group are valuable in applications such as pharmaceuticals, agrochemicals and industrial chemicals. In the present study, following the strict OECD modelling principles, a quantitative structure-toxicity relationship (QSTR) model for the rat acute oral toxicity of trifluoromethyl compounds (TFMs) was established by the genetic algorithm-multiple linear regression (GA-MLR) approach. All developed models were evaluated with various state-of-the-art validation metrics and against the OECD principles. The best QSTR model included nine easily interpretable 2D molecular descriptors with clear physical and chemical significance. The mechanistic interpretation showed that the atom-type electro-topological state indices, molecular connectivity, ionization potential, lipophilicity and some autocorrelation coefficients are the main factors contributing to the acute oral toxicity of TFMs in rats. To validate that the selected 2D descriptors can effectively characterize the toxicity, we performed a chemical read-across analysis. We also compared the best QSTR model with the public OPERA tool to demonstrate the reliability of the predictions. To further improve the prediction range of the QSTR model, we performed consensus modelling. Finally, the optimum QSTR model was used, for the first time, to predict a true external set containing many untested/unknown TFMs. Overall, the developed model contributes to a more comprehensive safety assessment approach for novel CF3-containing pharmaceuticals or chemicals, reducing unnecessary chemical synthesis whilst saving on the development cost of new drugs.

  • Article type: Journal Article
    α-Amylase (EC 3.2.1.1) is a ubiquitous digestive endoamylase. The abrupt rise in blood glucose levels caused by the rapid hydrolysis of carbohydrates by α-amylase is one of the main reasons for type 2 diabetes. Inhibitors prevent the action of digestive enzymes, slowing the digestion of carbohydrates and eventually assisting in the management of postprandial hyperglycemia. In the course of developing α-amylase inhibitors, we screened 2-aryliminothiazolidin-4-one based analogs for their in vitro α-amylase inhibitory potential and employed various in silico approaches for a detailed exploration of the bioactivity. The DNSA bioassay revealed that compounds 5c, 5e, 5h, 5j, 5m, 5o and 5t were more potent than the reference drug (IC50 value = 22.94 ± 0.24 μg mL⁻¹). The derivative 5o, with an -NO2 group on both rings, was the most potent analog with an IC50 value of 19.67 ± 0.20 μg mL⁻¹, whereas derivative 5a, with unsubstituted aromatic rings, showed poor inhibitory potential with an IC50 value of 33.40 ± 0.15 μg mL⁻¹. Reliable QSAR models were developed using the QSARINS software. The high value of R²ext = 0.9632 for model IM-9 showed that the built model can be applied to predict the α-amylase inhibitory activity of untested molecules. A consensus modelling approach was also employed to test the reliability and robustness of the developed QSAR models. Molecular docking and molecular dynamics were employed to validate the bioassay results by studying the conformational changes and interaction mechanisms. Furthermore, these compounds exhibited good ADMET characteristics and bioavailability when tested for in silico pharmacokinetics prediction parameters.

  • Article type: Journal Article
    Fused and non-fused polycyclic aromatic hydrocarbons (FNFPAHs) are a class of organic compounds that occur widely in the environment and pose a potential hazard to ecosystems and public health, and thus receive extensive attention from various regulatory agencies. Here, quantitative structure-activity relationship (QSAR) models were constructed to model the ecotoxicity of FNFPAHs against two aquatic species, Daphnia magna and Oncorhynchus mykiss. Following the stringent OECD guidelines, we used a genetic algorithm (GA) plus multiple linear regression (MLR) approach to establish QSAR models for the two aquatic toxicity endpoints: D. magna (48 h LC50) and O. mykiss (96 h LC50). The models were established using simple 2D descriptors with explicit physicochemical significance and evaluated using various internal/external validation metrics. The results show that both models are statistically robust (Q²LOO = 0.7834 for D. magna and Q²LOO = 0.8162 for O. mykiss), have good internal fitness (R² = 0.8159 for D. magna and R² = 0.8626 for O. mykiss) and good external predictive ability (D. magna: R²test = 0.8259, Q²Fn = 0.7640 to 0.8140, CCCtest = 0.8972; O. mykiss: R²test = 0.8077, Q²Fn = 0.7615 to 0.7722, CCCtest = 0.8910). To demonstrate the predictive performance of the developed models, an additional comparison with the standard ECOSAR tool shows that our models have lower RMSE values. Subsequently, we used the best models to predict the true external set compounds collected from the PPDB database to further fill the toxicity data gap. In addition, consensus models (CMs) that integrate all validated individual models (IMs) were more externally predictive than the IMs, with CM2 having the best prediction performance for the two aquatic species. Overall, the models presented here could be used to evaluate unknown FNFPAHs inside the applicability domain (AD), and are thus very important for environmental risk assessment under current regulatory frameworks.
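The abstract above reports RMSE and CCCtest for the external test sets. As a minimal sketch of how these two metrics are commonly computed in the QSAR literature (the paper's own implementation is not given here, and the function names and toy values below are hypothetical), RMSE is the root of the mean squared residual and CCC is Lin's concordance correlation coefficient:

```python
from math import sqrt


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted endpoint values."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def ccc(observed, predicted):
    """Lin's concordance correlation coefficient.

    Penalises both scatter and systematic offset from the 1:1 line,
    so a constant bias lowers CCC even when the correlation is perfect.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    return 2 * cov / (ss_o + ss_p + n * (mean_o - mean_p) ** 2)


# Hypothetical toy data, not taken from the paper.
y_obs = [1.0, 2.0, 3.0, 4.0]
y_biased = [2.0, 3.0, 4.0, 5.0]  # perfectly correlated but shifted by +1
print(rmse(y_obs, y_biased), ccc(y_obs, y_biased))
```

In the toy data the predictions are perfectly correlated with the observations, yet CCC falls below 1 because of the constant offset, which is why CCC is often reported alongside R² for external validation.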

  • Article type: Journal Article
    In the present study, ninety-five halogenated dioxins and related chemicals (dibenzo-p-dioxins, dibenzofurans, biphenyls, and naphthalenes) with the endpoint pEC50 were used to develop twelve quantitative structure-toxicity relationship (QSTR) models using the built-in Monte Carlo algorithm of the CORAL software. The hybrid optimal descriptor of correlation weights (DCW), based on a combination of SMILES and HSG (hydrogen-suppressed graph), was employed to generate the QSTR models. Three target functions, i.e. TF1 (WIIC = WCII = 0), TF2 (WIIC = 0.3 and WCII = 0) and TF3 (WIIC = 0.0 and WCII = 0.3), were employed to develop robust QSTR models, and the statistical outcomes of the target functions were compared with each other. The correlation intensity index (CII) was found to be a reliable benchmark of the predictive potential of QSTR models. The determination coefficient of the validation set of split 1 computed with TF3 was the highest (R²valid = 0.8438). The fragments responsible for the toxicity of dioxins and related chemicals were also identified as promoters of increase or decrease of pEC50. Three random splits (Split 1, Split 2 and Split 4) were selected for the extraction of these promoters. Finally, consensus modelling was performed using the intelligent consensus tool of DTC lab (https://dtclab.webs.com/software-tools). The original consensus model, created by combining four distinct models using the split 4 arrangement, was more predictive for the validation set: the determination coefficient of the test set (validation set) increased from 0.8133 to 0.9725. For the validation set of split 4, the mean absolute error (MAE 100%) was also lowered from 0.513 to 0.2739.
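The abstract above reports that combining four individual models lowered the validation-set MAE. The sketch below illustrates only the simplest form of consensus prediction, an unweighted arithmetic mean of the individual predictions for each compound. It is not the algorithm of the DTC lab intelligent consensus tool, whose selection logic is not described here, and all names and values are hypothetical:

```python
def mae(observed, predicted):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def consensus_mean(predictions_by_model):
    """Average the predictions compound-wise across individual models.

    predictions_by_model: list of equal-length prediction lists,
    one list per individual model, in the same compound order.
    """
    return [sum(values) / len(values) for values in zip(*predictions_by_model)]


# Hypothetical pEC50-like values, not taken from the paper.
observed = [5.0, 6.0, 7.0]
model_a = [5.4, 6.4, 7.4]  # tends to over-predict
model_b = [4.6, 5.8, 6.4]  # tends to under-predict

consensus = consensus_mean([model_a, model_b])
print(mae(observed, model_a), mae(observed, model_b), mae(observed, consensus))
```

Averaging helps here because the two toy models err in opposite directions, so their errors partly cancel. When all individual models share the same bias, a plain mean offers no such gain, which is one motivation for more selective consensus schemes.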

  • Article type: Journal Article
    Deep eutectic solvents (DES) are often regarded as greener, sustainable alternative solvents and are currently employed in many industrial applications on a large scale. Given the industrial importance of DES, and because the vast majority of DES have yet to be synthesized, the development of cheminformatic models and tools that efficiently profile their density becomes essential. In this work, after rigorous validation, quantitative structure-property relationship (QSPR) models were proposed for estimating the density of a wide variety of DES. These models were based on a modelling dataset previously employed for constructing thermodynamic models for the same endpoint. The best QSPR models were robust and sound, performing well on an external validation set (built from recently reported experimental density data of DES). Furthermore, the results revealed structural features that could play crucial roles in determining DES density. Then, intelligent consensus prediction was employed to develop a consensus model with improved predictive accuracy. All models were derived using publicly available tools to facilitate easy reproducibility of the proposed methodology. Future work may involve setting up reliable, interpretable cheminformatic models for other thermodynamic properties of DES and guiding the design of these solvents for applications.

  • Article type: Journal Article
    The assessment of the cytotoxicity of quantum dots is essential for environmental and health risk analysis. In the present work we have modelled the HeLa cell cytotoxicity of sixty-one CdSe quantum dots with a ZnS shell as a function of their experimental conditions and molecular construction using quasi-SMILES representations. The index of ideality of correlation helped in building ten statistically significant models with good fitting ability, with R² values ranging from 0.8414 to 0.9609 for the training set. The split 5 model is rated as the best model, with values of R², Q²F1, Q²F2 and Q²F3 of 0.8964, 0.8267, 0.8264 and 0.8777, respectively, for the calibration set. The extraction of features causing an increase or decrease in the cytotoxicity of quantum dots indicates the importance of neutral surface charge, a surface modified with protein, a 72 h exposure time, and the combination of the MTT assay with surface protein in decreasing the cytotoxicity. Amphiphilic polymer, polyol ligand with neutral charge, 0.5 - 0.6 nm quantum dot diameter with lipid ligand, and an unmodified positively charged surface are grouped as toxicity-enhancing features. Further, consensus modelling using the split 5 and split 8 patterns enhances the prediction quality by increasing R²val to 0.9361 and 0.9656, respectively.
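The abstract above quotes Q²F1, Q²F2 and Q²F3. A minimal sketch of these three metrics follows, using the definitions commonly given in the QSAR validation literature; whether the paper computed them in exactly this way is an assumption, and the function names and toy numbers are hypothetical. All three compare the squared prediction error on a held-out set with a reference variance, and they differ only in which reference is used:

```python
def _mean(values):
    return sum(values) / len(values)


def _press(y_ext, y_pred):
    """Sum of squared prediction errors on the held-out set."""
    return sum((y - p) ** 2 for y, p in zip(y_ext, y_pred))


def q2_f1(y_ext, y_pred, y_train):
    """Reference: deviation of held-out values from the TRAINING mean."""
    mean_tr = _mean(y_train)
    return 1 - _press(y_ext, y_pred) / sum((y - mean_tr) ** 2 for y in y_ext)


def q2_f2(y_ext, y_pred):
    """Reference: deviation of held-out values from their OWN mean."""
    mean_ext = _mean(y_ext)
    return 1 - _press(y_ext, y_pred) / sum((y - mean_ext) ** 2 for y in y_ext)


def q2_f3(y_ext, y_pred, y_train):
    """Reference: per-compound variance of the TRAINING set."""
    mean_tr = _mean(y_train)
    tss_train = sum((y - mean_tr) ** 2 for y in y_train)
    return 1 - (_press(y_ext, y_pred) / len(y_ext)) / (tss_train / len(y_train))


# Hypothetical toy data, not taken from the paper.
y_train = [1.0, 2.0, 3.0, 4.0, 5.0]
y_ext = [2.0, 4.0]
y_pred = [2.5, 3.5]
print(q2_f1(y_ext, y_pred, y_train), q2_f2(y_ext, y_pred), q2_f3(y_ext, y_pred, y_train))
```

Because Q²F3 normalises by the training-set variance, it is less sensitive to how the held-out compounds happen to be distributed, which is why it can differ noticeably from Q²F1 and Q²F2 for the same predictions.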

  • Article type: Journal Article
    The glass transition temperature is a vital property of polymers with a direct impact on their stability. In the present study, we built quantitative structure-property relationship models for the prediction of the glass transition temperatures of polymers using a data set of 206 diverse polymers. Various 2D molecular descriptors were computed from the single repeating units of the polymers. We derived five models, each from a different combination of six descriptors, by employing the double cross-validation technique followed by partial least squares regression. The selected models were subsequently validated by methods such as cross-validation, external validation using test set compounds, the Y-randomization (Y-scrambling) test and an applicability domain study of the developed models. All of the models have statistically significant metric values, with r² ranging from 0.713 to 0.759, Q² ranging from 0.662 to 0.724 and [Formula: see text] ranging from 0.702 to 0.805. Finally, a comparison was made with recently published models, though the previous models were based on a much smaller data set with limited diversity. We also used a true external set to demonstrate the performance of our developed models, which may be used for the prediction and design of novel polymers prior to their synthesis.
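The Y-randomization (Y-scrambling) test mentioned above checks that a model's fit is not a chance correlation: the response values are shuffled repeatedly, the model is refitted each time, and the scrambled fits should be much worse than the real one. The sketch below shows the idea with a single hypothetical descriptor and ordinary least squares, for which r² equals the squared Pearson correlation. The paper itself used partial least squares with six descriptors, so this is a simplification, and all names and data are invented for illustration:

```python
import random


def r_squared(x, y):
    """r² of a one-descriptor least-squares fit (squared Pearson correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def y_randomization(x, y, n_trials=100, seed=0):
    """Return the real r² and the r² values obtained after shuffling y."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    scrambled = []
    for _ in range(n_trials):
        shuffled = list(y)
        rng.shuffle(shuffled)  # break the descriptor-response pairing
        scrambled.append(r_squared(x, shuffled))
    return r_squared(x, y), scrambled


# Hypothetical descriptor and response values, not taken from the paper.
descriptor = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
response = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]

real_r2, scrambled_r2 = y_randomization(descriptor, response)
print(real_r2, sum(scrambled_r2) / len(scrambled_r2))
```

A real r² far above the average scrambled r² suggests that the relationship is not an artefact of the descriptor selection.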

  • Article type: Journal Article
    Aquatic bioconcentration factors (BCFs) are critical in the PBT (persistent, bioaccumulative, toxic) and risk assessment of chemicals. The high costs and the use of more than 100 fish per standard BCF study (OECD 305) call for alternative methods to replace as much in vivo testing as possible. The BCF waiving scheme is a screening tool combining QSAR classifications based on physicochemical properties related to the distribution (hydrophobicity, ionisation), persistence (biodegradability, hydrolysis), solubility and volatility (Henry's law constant) of substances in water bodies and aquatic biota to predict substances with low aquatic bioaccumulation (nonB, BCF < 2000). The BCF waiving scheme was developed with a dataset of reliable BCFs for 998 compounds and externally validated with another 181 substances. It performs with 100% sensitivity (no false negatives) and >50% efficacy (waiving potential), and complies with the OECD principles for valid QSARs. The chemical applicability domain of the BCF waiving scheme is given by the structures of the training set, with some compound classes explicitly excluded, such as organometallics, poly- and perfluorinated compounds, aromatic triphenylphosphates and surfactants. The prediction confidence of the BCF waiving scheme is based on applicability domain compliance, consensus modelling, and the structural similarity with known nonB and B/vB substances. Compounds classified as nonB by the BCF waiving scheme are candidates for waiving of BCF in vivo testing on fish due to low concern with regard to the B criterion. The BCF waiving scheme supports the 3Rs with a possible reduction of >50% of BCF in vivo testing on fish. If the target chemical is outside the applicability domain of the BCF waiving scheme or not classified as nonB, further assessments with in silico, in vitro or in vivo methods are necessary to either confirm or reject bioaccumulative behaviour.
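The two headline figures above, sensitivity and efficacy, can be illustrated with a small calculation. The sketch below assumes that the positive class is "bioaccumulative" (measured BCF of 2000 or more, following the nonB threshold in the abstract), that a false negative is a bioaccumulative compound wrongly classified as nonB, and that efficacy is the share of all screened compounds classified as nonB. The last definition is an assumption, since the abstract only glosses efficacy as "waiving potential", and the function name and data are hypothetical:

```python
B_THRESHOLD = 2000  # nonB is defined as BCF < 2000 in the abstract


def screening_performance(measured_bcf, predicted_nonb):
    """Return (sensitivity, efficacy) for a waiving-style screen.

    measured_bcf:   list of measured BCF values
    predicted_nonb: list of booleans, True if the screen classifies
                    the compound as nonB (candidate for waiving)
    """
    truly_b = [bcf >= B_THRESHOLD for bcf in measured_bcf]
    n_b = sum(truly_b)
    if n_b == 0:
        raise ValueError("sensitivity is undefined without bioaccumulative compounds")
    # False negatives: truly bioaccumulative but waived as nonB.
    false_negatives = sum(b and waived for b, waived in zip(truly_b, predicted_nonb))
    sensitivity = (n_b - false_negatives) / n_b
    # Assumed definition: fraction of all compounds that would be waived.
    efficacy = sum(predicted_nonb) / len(predicted_nonb)
    return sensitivity, efficacy


# Hypothetical values, not taken from the paper.
bcf_values = [120, 3500, 800, 50, 2500, 1500]
waived = [True, False, True, True, False, False]
print(screening_performance(bcf_values, waived))
```

In this toy set no bioaccumulative compound is waived, so sensitivity is 1.0, while one nonB compound (BCF 1500) is not waived and would go on to further assessment. This mirrors the conservative design described above: compounds are only waived when the screen is confident, at the cost of some unnecessary follow-up.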