Keywords: Abraham solvation model; Applicability domain; Chemical properties; Henry's Law constant; Octanol–water partitioning; PPLFER; Prediction uncertainty; QSPR; Solubility; Vapor pressure

Source: DOI:10.1186/s13321-024-00853-w

Abstract:
This study describes the development and evaluation of six new models for predicting physical-chemical (PC) properties that are highly relevant for chemical hazard, exposure, and risk estimation: solubility in water (SW) and in octanol (SO), vapor pressure (VP), and the octanol-water (KOW), octanol-air (KOA), and air-water (KAW) partition ratios. The models are implemented in the Iterative Fragment Selection Quantitative Structure-Activity Relationship (IFSQSAR) Python package, version 1.1.0, as Poly-Parameter Linear Free Energy Relationship (PPLFER) equations, which combine experimentally calibrated system parameters with solute descriptors predicted by QSPRs. Two ancillary models have also been developed and implemented: a QSPR for molar volume (MV) and a classifier for the physical state of chemicals at room temperature. The IFSQSAR methods for characterizing the applicability domain (AD) and for calculating uncertainty estimates, expressed as 95% prediction intervals (PIs) for predicted properties, are described and tested on 9,000 measured partition ratios and 4,000 VP and SW values. The measured data are external to the IFSQSAR training and validation datasets and are used to assess the predictivity of the models for "novel chemicals" in an unbiased manner. The 95% PIs calculated from the validation datasets for partition ratios needed to be scaled by a factor of 1.25 to capture 95% of the external data. Predictions for VP and SW are more uncertain, primarily because of the difficulty of determining whether a chemical is a liquid or a solid at room temperature. The prediction accuracy of the models for log KOW, log KAW, and log KOA of novel, data-poor chemicals is estimated to be in the range of 0.7 to 1.4 root mean squared error of prediction (RMSEP), with RMSEP in the range of 1.7-1.8 for log VP and log SW.
Scientific contribution: New partitioning models integrate empirical PPLFER equations and QSARs, allowing for seamless integration of experimental data and model predictions. This work tests the real predictivity of the models for novel chemicals which are not in the model training or external validation datasets.
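The two evaluation quantities named in the abstract, RMSEP and the fraction of external measurements captured by a (possibly scaled) 95% PI, can be illustrated with a minimal sketch. This is not the IFSQSAR API: the function names `rmsep` and `pi_coverage`, the symmetric half-width representation of the interval, and the four data points are all hypothetical and chosen only to show the arithmetic.

```python
import math

def rmsep(predicted, measured):
    """Root mean squared error of prediction over paired log-unit values."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))

def pi_coverage(predicted, half_widths, measured, scale=1.0):
    """Fraction of measurements inside prediction +/- scale * half_width.

    Assumes a symmetric interval around each prediction (an assumption
    of this sketch, not a statement about how IFSQSAR stores its PIs).
    """
    if not (len(predicted) == len(half_widths) == len(measured)) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1
        for p, h, m in zip(predicted, half_widths, measured)
        if abs(m - p) <= scale * h
    )
    return hits / len(measured)

# Hypothetical log-unit values, invented for illustration only.
predicted = [2.0, 3.5, -1.0, 0.5]
measured = [2.5, 3.0, -1.0, 2.0]
half_widths = [0.45, 0.6, 0.5, 1.0]  # hypothetical unscaled 95% PI half-widths

error = rmsep(predicted, measured)
unscaled = pi_coverage(predicted, half_widths, measured)             # 0.5
scaled = pi_coverage(predicted, half_widths, measured, scale=1.25)   # 0.75
```

In this toy data, widening the intervals by the factor 1.25 captures one additional measurement. The abstract reports the analogous adjustment on real external data, where the factor 1.25 was needed for the partition-ratio PIs to reach 95% coverage.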