Keywords: ADME; Machine learning; Multi-task learning; Pharmacokinetics; QSAR

Source: DOI:10.1002/minf.202400079

Abstract:
ADME (Absorption, Distribution, Metabolism, Excretion) properties are key parameters for judging whether a drug candidate exhibits a desired pharmacokinetic (PK) profile. In this study, we tested multi-task machine learning (ML) models to predict ADME and animal PK endpoints, trained on in-house data generated at Boehringer Ingelheim. Models were evaluated both at the design stage of a compound (i.e., no experimental data of test compounds available) and at the testing stage, when a particular assay would be conducted (i.e., experimental data of earlier conducted assays may be available). Using realistic time-splits, we found a clear performance benefit of multi-task graph-based neural network models over single-task models, which was even stronger when experimental data of earlier assays were available. In an attempt to explain the success of multi-task models, we found that especially the endpoints with the largest numbers of data points (physicochemical endpoints, clearance in microsomes) are responsible for increased predictivity in more complex ADME and PK endpoints. In summary, our study provides insight into how data for multiple ADME/PK endpoints in a pharmaceutical company can be best leveraged to optimize the predictivity of ML models.
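Two ideas from the abstract can be illustrated with a small sketch: a time-split, which trains on compounds registered before a cutoff date and tests on later ones, and a multi-task loss that skips endpoints not measured for a given compound. This is not the authors' implementation. The field name `registered`, the function names, and the use of NaN to mark a missing assay value are all assumptions made for illustration.

```python
import math
from datetime import date


def time_split(records, cutoff):
    """Split records by date: before `cutoff` goes to train, the rest to test.

    `records` is a list of dicts with a hypothetical `registered` date field.
    """
    train = [r for r in records if r["registered"] < cutoff]
    test = [r for r in records if r["registered"] >= cutoff]
    return train, test


def masked_mse(y_true, y_pred):
    """Mean squared error over measured entries only.

    Rows are compounds, columns are endpoints (tasks). A NaN in `y_true`
    marks an endpoint that was not measured (an assumed convention), so
    that entry contributes nothing to the loss.
    """
    squared_errors = [
        (pred - true) ** 2
        for row_true, row_pred in zip(y_true, y_pred)
        for true, pred in zip(row_true, row_pred)
        if not math.isnan(true)
    ]
    if not squared_errors:
        return 0.0
    return sum(squared_errors) / len(squared_errors)
```

In a multi-task setting, most compounds have values for only a few endpoints, so a masked loss of this kind lets sparsely measured endpoints share one model with densely measured ones.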